Abstract

CTL is the dominant temporal specification language in practice, mainly because it admits model checking in
linear time. Logic programming and the database query language Datalog are often used as an implementation
platform for logic languages. In this paper we present the exact relation between CTL and Datalog and, moreover,
we build on this relation and known efficient algorithms for CTL to obtain efficient algorithms for fragments of
stratified Datalog. The contributions of this paper are: a) We embed CTL into STD, which is a proper fragment
of stratified Datalog. Moreover, we show that STD expresses exactly CTL; we prove this by embedding STD into
CTL. Both embeddings are linear. b) CTL can also be embedded into fragments of Datalog without negation. We
define a fragment of Datalog with the built-in successor predicate, which we call TDS, and we embed CTL into
TDS in linear time. We build on the above relations to answer open problems of stratified Datalog. We prove
that query evaluation is linear and that the containment and satisfiability problems are both decidable. The
results presented in this paper are the first for fragments of stratified Datalog that are more general than
those containing only unary EDBs.

1 Introduction

Temporal logics are modal logics used for the description and specification of the temporal ordering of events
[Eme90]. Pnueli was the first to notice that temporal logics could be particularly useful for the specification
and verification of reactive systems [Pnu77, Pnu81]. In defining temporal logics, there are two possible views
regarding the flow of time. One is that of linear time: at each moment there is only one possible future (Linear
Temporal Logic, LTL). The other is that of branching time (tree-like nature): at each moment time may follow
different paths which represent different possible futures [EH86, Lam80]. The most prominent examples of the
latter are CTL (Computational Tree Logic), CTL* (Full Branching Time Logic), and the $\mu$-calculus.

Deciding whether a system meets a specification expressed in a language of temporal logic is called model
checking. Model checking is decidable when the system is abstracted as a finite directed labeled graph and
the specification is expressed in a propositional temporal language. Model checking has been widely used for
verifying the correctness of, or finding design errors in, many real-life systems [CW96]. Through the 1990s, CTL
became the dominant temporal specification language for industrial use [Var01, CGL93], mostly due to its
balance of expressive power and linear model checking complexity. SMV [McM93], the first symbolic model
checker (CTL-based), and its follower VIS [BHS+96] (also CTL-based), were phenomenally successful and serve
as the basis for many industrial model checkers.

*The project is co-funded by the European Social Fund (75%) and National Resources (25%) - Operational Program for
Educational and Vocational Training II (EPEAEK II) and particularly the Program HERAKLEITOS.

The introduction of Datalog [Ull88] represented a major breakthrough in the design of declarative, logic-oriented
database languages due to Datalog's ability to express recursive queries. Datalog is a rule-based language
that has simple and elegant semantics based on the notion of minimal model or least fixpoint. This leads
to an operational semantics that can be implemented efficiently, as demonstrated by a number of prototypes
of deductive database systems [NT89, RSS92, ELM+97]. Datalog queries are computed in polynomial time;
however, it has been shown that Datalog only captures a proper subset of monotonic polynomial-time queries
[ACY91].

In order to express queries of practical interest, negation is allowed in the bodies of Datalog rules. Of
particular interest is stratified negation, which avoids the semantic and implementation problems connected
with the unrestricted use of nonmonotonic constructs in recursive definitions. In stratified Datalog [ABW88,
Ull88, CH85] negation is allowed in any predicate under the constraint that negated predicates are computed
in previous strata. Simple, intuitive semantics leading to efficient implementation exists for stratified Datalog.
Unfortunately, as shown in [Kol91], this language has limited expressive power, as it can only express a proper
subset of fixpoint queries.

We have three major contributions in this paper. The first contribution is the definition of a fragment of
stratified Datalog (the class STD) which has exactly the same expressive power as CTL (Theorem 14). We prove this
by establishing a linear embedding from STD into CTL and vice versa. This is the first time that a fragment of
stratified Datalog is identified which expresses exactly CTL. The definition of this fragment is simple and natural
(see Subsection 4.1).

For our second contribution, we build on the above result to solve open problems of stratified Datalog. More
specifically, we prove that: a) query evaluation for STD is linear, by reducing it to the model checking problem of
CTL, and b) both the satisfiability and the containment problems are decidable for STD programs, by reduction to
the validity problem of CTL. This is the first result that proves decidability of containment for a fragment of
stratified Datalog which uses EDB (Extensional Database) relations other than unary ones and hence does not have
a limited number of nontrivial strata. Checking containment of queries, i.e., verifying whether one query yields
a subset of the result of the other, has been a subject of research for decades. Query containment is crucial in
many contexts such as query optimization, query reformulation, knowledge-base verification, information
integration, integrity checking and cooperative answering. Table 1 presents all known results on query
containment for stratified Datalog, including the results we obtain here.

We also consider a fragment of a variant of Datalog without negation. We define the class TDS, which is a
fragment of Datalog$_{Succ}$, and establish a linear embedding from CTL to TDS. Datalog$_{Succ}$ is Datalog enhanced
with the built-in successor predicate and allows negation only in the EDB predicates. The successor predicate is
needed to express the universal quantifier, which in stratified Datalog can be captured by using the full power of
negation. Note that we use the conventional semantics of Datalog, and this constitutes a contribution relative
to previous work [GFAA03]. This is the third contribution.

1.1 Motivating Examples 

The following three examples illustrate some of the subtle points of the translation of a CTL formula into stratified
Datalog and they are presented in order of increasing complexity. The subtleties in the case of Datalog$_{Succ}$ are
of a similar nature. In all examples, we consider a Kripke structure $\mathcal{K}$, which is given by: a set of states $W$, the
transition relation $R$ on the states, and atomic propositions assigned to the states.

Example 1.1 This is the first motivating example for our translation techniques. Consider the CTL formula
$\varphi = \mathbf{E}\bigcirc p$. It says that there exists a path starting from a state $s_0$ such that the next state on this path is assigned
the atomic proposition $p$. We may view the Kripke structure as a database $D$ with unary EDB predicates for the
atomic propositions (here EDB predicate $P$ is associated to $p$) and a binary EDB predicate $R$ for the transition
relation. Now the following Datalog program says that if $s_0$ is computed in the answers of the query predicate $G$,
then there exists a path in $D$ starting from $s_0$ which in one transition step reaches a state where $P$ is true.

    $G(x) \leftarrow R(x,y), G_1(y)$
    $G_1(x) \leftarrow P(x)$

▲ 
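The two rules of Example 1.1 can be evaluated directly on a small database. The following Python fragment is only a minimal sketch under stated assumptions: the three-state structure and the function name `eval_example_1_1` are hypothetical and serve purely as illustration, with relations encoded as Python sets.

```python
def eval_example_1_1(R, P):
    """Answers of the goal predicate G for the program of Example 1.1."""
    # G1(x) :- P(x)
    G1 = set(P)
    # G(x) :- R(x,y), G1(y)
    return {x for (x, y) in R if y in G1}

# Hypothetical Kripke structure viewed as a database: p holds only at s1.
R = {("s0", "s1"), ("s1", "s2"), ("s2", "s2")}
P = {"s1"}
answers = eval_example_1_1(R, P)
print(answers)
```

Only `s0` has a successor at which `P` holds, so it is the only state returned by the query.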
                 Stratified     STD (stratified negation with       Stratified negation with
                 negation       unary + 1 binary EDB predicates)    unary EDB predicates

Containment      undecidable    EXPTIME-complete [Section 6]        decidable [LMSS93, HMSS01]
Equivalence      undecidable    EXPTIME-complete [Section 6]        decidable [LMSS93, HMSS01]
Satisfiability   undecidable    EXPTIME-complete [Section 6]        decidable
Evaluation       polynomial     linear [Section 6]                  linear [Section 6]

Table 1: Results on fragments of Stratified Datalog



Whereas this is not a recursive program, when the formula contains the "until" modality, recursion is needed 
as is the case in the example that follows. 

Example 1.2 Consider now the somewhat more complex formula $\psi = \mathbf{E}\bigcirc p \wedge \mathbf{E}(q \mathbf{U} t)$. The Datalog query that
expresses this formula is the following.

    $G(x) \leftarrow G_1(x), G_2(x)$
    $G_1(x) \leftarrow R(x,y), G_3(y)$
    $G_3(x) \leftarrow P(x)$
    $G_2(x) \leftarrow G_4(x)$
    $G_2(x) \leftarrow G_5(x), R(x,y), G_2(y)$
    $G_4(x) \leftarrow T(x)$
    $G_5(x) \leftarrow Q(x)$

This Datalog query expresses what the CTL formula says, i.e., there exists a path starting from a state $s_0$
that is assigned $p$ on its next state, and there is also a path (different or the same) that is assigned $q$ along
all its states up until it gets to a state that is assigned $t$. The second and third rules express the CTL formula
$\mathbf{E}\bigcirc p$, the last four rules express the CTL formula $\mathbf{E}(q \mathbf{U} t)$, and the first rule asserts the conjunction of $\mathbf{E}\bigcirc p$ and
$\mathbf{E}(q \mathbf{U} t)$. ▲
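The recursion in the rules for $G_2$ corresponds to a least fixpoint computation. The Python fragment below is a minimal sketch of a naive bottom-up evaluation of the program of Example 1.2; the sample structure and the function name `eval_example_1_2` are hypothetical, and the iteration strategy is an illustrative choice rather than the evaluation method used later in the paper.

```python
def eval_example_1_2(R, P, Q, T):
    """Answers of G for the program of Example 1.2 (naive iteration)."""
    G3 = set(P)                                # G3(x) :- P(x)
    G1 = {x for (x, y) in R if y in G3}        # G1(x) :- R(x,y), G3(y)
    G4, G5 = set(T), set(Q)                    # G4(x) :- T(x);  G5(x) :- Q(x)
    G2 = set(G4)                               # G2(x) :- G4(x)
    while True:
        # G2(x) :- G5(x), R(x,y), G2(y)
        new = {x for (x, y) in R if x in G5 and y in G2} - G2
        if not new:
            break
        G2 |= new
    return G1 & G2                             # G(x) :- G1(x), G2(x)

# Hypothetical structure: a -> b -> c -> c, with p at b, q at a and b, t at c.
R = {("a", "b"), ("b", "c"), ("c", "c")}
answers = eval_example_1_2(R, P={"b"}, Q={"a", "b"}, T={"c"})
print(answers)
```

Here `G2` grows from `{c}` to `{a, b, c}` in two rounds, while `G1` contains only `a`, so the conjunction holds at `a` alone.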

Now, there is a more complicated recursive case which requires a recursive predicate with two arguments and 
this is demonstrated in the third example. 

Example 1.3 Consider the CTL formula $\mathbf{E}\Box p$, which can also be written as $\mathbf{E}(\bot \widetilde{\mathbf{U}} p)$. This formula says that
there is an infinite path from a state $s_0$ such that $p$ holds in all the states of the path. The existence of an
infinite path on a finite Kripke structure is equivalent to the existence of a cycle. The following Datalog program¹
expresses exactly this formula. In this program the rules with head predicate $W$ are ancillary; they just say
that $x$ is an element of the domain (the EDB predicates $P_i$, $0 \leq i \leq n$, correspond to the atomic propositions).
They are used to obtain safe rules and to express true and false. Note that the second rule never fires, hence
$G_2$ expresses false and $G_1$ expresses true. Thus the rules that express the essential meaning of the formula are 3-8.

    $G_1(x) \leftarrow W(x)$
    $G_2(x) \leftarrow W(x), \neg G_1(x)$
    $G_3(x) \leftarrow P(x)$
    $G(x) \leftarrow G_2(x), G_3(x)$
    $G(x) \leftarrow B(x,x)$
    $G(x) \leftarrow G_3(x), R(x,y), G(y)$
    $B(x,y) \leftarrow G_3(x), R(x,y), G_3(y)$
    $B(x,y) \leftarrow G_3(x), R(x,u), B(u,y)$
    $W(x) \leftarrow R(x,y)$
    $W(x) \leftarrow R(y,x)$
    $W(x) \leftarrow P_0(x)$
    ...
    $W(x) \leftarrow P_n(x)$

The two rules (7th and 8th) that compute $B$ (combined with the third rule) actually compute the transitive
closure of $R$ over states where $P$ is true. The fifth rule says that the formula holds if there is a cycle starting
from state $s_0$ with $P$ assigned to all its states. The sixth rule says that the formula holds if there is a path which
is followed by a cycle from a state $s_0$ with $P$ assigned to all their states. ▲

¹ It is easy to observe that this particular Datalog program can be equivalently written using fewer rules. However, we have written
it here in the form derived by our algorithm.
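To make the role of the binary predicate $B$ concrete, the following Python fragment evaluates rules 3 and 5-8 of Example 1.3 by naive iteration. It is a sketch under stated assumptions: the sample structure and the name `eval_example_1_3` are hypothetical, and rule 4 is omitted because $G_2$ is always empty, so that rule never contributes an answer.

```python
def eval_example_1_3(R, P):
    """Answers of G for the program of Example 1.3 (rules 3 and 5-8)."""
    G3 = set(P)                                          # G3(x) :- P(x)
    # B(x,y) :- G3(x), R(x,y), G3(y)
    B = {(x, y) for (x, y) in R if x in G3 and y in G3}
    while True:
        # B(x,y) :- G3(x), R(x,u), B(u,y)
        new = {(x, y) for (x, u) in R if x in G3
               for (v, y) in B if v == u} - B
        if not new:
            break
        B |= new
    G = {x for (x, y) in B if x == y}                    # G(x) :- B(x,x)
    while True:
        # G(x) :- G3(x), R(x,y), G(y)
        new = {x for (x, y) in R if x in G3 and y in G} - G
        if not new:
            break
        G |= new
    return G

# Hypothetical structure: a -> b, b -> c, c -> b, c -> d, d -> d; p holds at a, b, c.
R = {("a", "b"), ("b", "c"), ("c", "b"), ("c", "d"), ("d", "d")}
answers = eval_example_1_3(R, P={"a", "b", "c"})
print(answers)
```

States `b` and `c` lie on a cycle of `P`-states, and `a` reaches that cycle through `P`-states, so all three are answers, whereas `d` is not because `P` fails there.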



1.2 Technical Challenges 

The examples illustrated the part of our contribution that translates a CTL formula to a Datalog query. However,
there are a few technical challenges that do not show in these examples: 1) By a straightforward translation some
Datalog rules might not be safe (i.e., they may have variables that do not occur in nonnegated body subgoals).
Thus we introduce a number of rules which essentially define the domain by an IDB (Intensional Database)
predicate which is used in rules for safety; this shows a little in Example 1.3. 2) Trying to identify a fragment of
Datalog with exactly the same expressive power as CTL, and to use this fact to prove results for this fragment, we
have to deal with the fact that CTL is interpreted over infinite paths. This means that finite Kripke structures
over which we interpret CTL have to be total on the binary relation $R$. Relational databases over which
Datalog programs are interpreted, however, do not have any constraints, i.e., the input could be any structure of
the given schema. A solution to this kind of problem that is suggested in the literature [Eme90] is to add a self
loop to those nodes that do not have a successor in $R$. We adopt a similar solution, except that we encompass it in
the definitions of the Datalog fragment we define, thus allowing any input database to be captured. The example
that follows explains this point further.

Example 1.4 Consider the following Datalog program:

    $A(x) \leftarrow R(x,y)$
    $G(x) \leftarrow \neg A(x), P(x)$
    $G(x) \leftarrow R(x,y), P(y)$

It is easy to see that it returns the same answer on any pair of databases which only differ in adding self loops
in $R$ on nodes that do not have a successor in $R$. ▲
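The invariance claimed in Example 1.4 can be checked on a concrete instance. The Python fragment below is an illustrative sketch only: the helper `add_self_loops`, the function `eval_example_1_4` and the sample database are hypothetical names and data introduced for this check.

```python
def eval_example_1_4(R, P):
    """Answers of G for the program of Example 1.4."""
    A = {x for (x, _) in R}                        # A(x) :- R(x,y)
    no_successor = {x for x in P if x not in A}    # G(x) :- not A(x), P(x)
    next_in_p = {x for (x, y) in R if y in P}      # G(x) :- R(x,y), P(y)
    return no_successor | next_in_p

def add_self_loops(W, R):
    """Make R total on W by adding a self loop to every node without a successor."""
    has_successor = {x for (x, _) in R}
    return set(R) | {(x, x) for x in W if x not in has_successor}

# Hypothetical database: a -> b -> c, where c has no successor and P holds at c.
W = {"a", "b", "c"}
R = {("a", "b"), ("b", "c")}
P = {"c"}
before = eval_example_1_4(R, P)
after = eval_example_1_4(add_self_loops(W, R), P)
print(before, after)
```

Without the self loop, `c` is produced by the rule with the negated subgoal; with the self loop it is produced by the last rule instead, so both databases give the same answer.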



Finally, our results go through because CTL has the bounded model property, which means that if there is a
model for a CTL formula then there is a finite model. Since in CTL infinite models are also assumed, in order
to carry over results to Datalog, where finite input is assumed, we make use of this property.

The rest of the paper is organized as follows. Sections 2 and 3 are preliminary sections that formally define
CTL (Section 2), Datalog, Datalog$_{Succ}$ and stratified Datalog (Section 3). Section 4 presents the formalism of
our translation, discusses the notion of equivalence between CTL formulae and Datalog queries and defines the
class of Stratified Temporal Datalog (STD) programs, which is a fragment of stratified Datalog. The embedding
from CTL to STD is also presented in Section 4. Section 5 gives the embedding from STD to CTL, which is
not straightforward, so a discussion of the technical challenges of this embedding is also included. In Section 6
we prove that query evaluation for STD programs is linear and that checking containment and satisfiability is
decidable. The embedding of CTL into Datalog$_{Succ}$ is presented in Section 7. Finally, Section 8 shows how the
present work can be extended to infinite structures and discusses possible future research directions. The proof
of Theorem 7.1 is presented in the Appendix.
1.3 Related Work



Model checking is closely related to database query evaluation. The idea is based on the principle that Kripke
structures can be viewed as relational databases [IV97]. One effective approach for efficiently implementing model
checking is based on the translation of temporal formulae into automata and has become an intensive research
area [WVS83, VW86, VW94]. Another approach consists in translating temporal logics to Logic Programming
[Llo87]. Logic Programming has been successfully used as an implementation platform for verification systems
such as model checkers. Translations of temporal logics such as CTL or the $\mu$-calculus into logic programming can
be found in [RRR+97, CDD+98, HPOS]. [CDD+98] presents the LMC project, which uses XSB, a tabled logic
programming system that extends Prolog-style SLD resolution with tabled resolution.

The database query language Datalog has inspired work in [GGV02], where the language Datalog LITE is
introduced. Datalog LITE is a variant of Datalog that uses stratified negation, restricted variable occurrences
and a limited form of universal quantification in rule bodies. Datalog LITE is shown to encompass CTL and the
alternation-free $\mu$-calculus. Research on model checking in the modal $\mu$-calculus is pursued in [ZSS94], where the
connection between the modal $\mu$-calculus and Datalog is observed. This is used to derive results about the parallel
computational complexity of this fragment of the modal $\mu$-calculus.

In previous work [GFAA03] we showed that the model checking problem for CTL can be reduced to the query
evaluation problem for fragments of Datalog. In more detail, [GFAA03] presents a direct and modular translation
from the temporal logics CTL, ETL, FCTL (CTL extended with the ability to express fairness) and the modal
$\mu$-calculus to Monadic inf-Datalog with built-in predicates. It is called inf-Datalog because the semantics differ
from the conventional Datalog least fixed point semantics, in that some recursive rules (corresponding to least
fixed points) are allowed to unfold only finitely many times, whereas others (corresponding to greatest fixed
points) are allowed to unfold infinitely many times. The work in [AAP+03], which is a preliminary version of
some of the results presented here, embeds CTL into a fragment of Datalog$_{Succ}$.

We know that CTL can be embedded into Transitive Closure logic [IV97] and into the alternation-free $\mu$-calculus
[Eme96]. In [GGV02] the authors observe that CTL can be embedded into stratified Datalog. This paper is
the first to identify the exact fragment of stratified Datalog with the same expressive power as CTL.

Concerning containment of queries, the majority of research refers to conjunctive queries (CQs). However, there
are also important results concerning Datalog programs. In [CGKV88] it was pointed out that query containment for
monadic Datalog is decidable. The work in [Sag88] shows that checking containment of nonrecursive Datalog queries
in Datalog queries is decidable in exponential time. In [CV97] it is shown that containment of Datalog queries in
nonrecursive Datalog is decidable in triply exponential time, whereas when the nonrecursive query is represented
as a union of CQs, the complexity is doubly exponential. In [LMSS93, HMSS01] the authors proved that equivalence
of stratified Datalog programs is decidable, but only for programs with unary EDB predicates. Our results are
the first that also encompass programs that contain binary EDB predicates.

2 CTL 

2.1 Syntax and Semantics of CTL 

Temporal logics are classified as linear or branching according to the way they perceive the nature of time. In 
linear temporal logics every moment has a unique future (successor), whereas in branching temporal logics every
moment may have more than one possible future. Branching temporal logic formulae are interpreted over infinite
trees or graphs that can be unwound into infinite trees. Such a structure can be thought of as describing all the
possible computations of a nondeterministic program (branches stand for nondeterministic choices). Note that a
time step is usually identified with a computation step (e.g., a clock tick in a synchronous design). The future is
considered to be reflexive, i.e., it includes the present, and time is considered to unfold in discrete steps.

CTL (Computational Tree Logic) [CE81, EC82] is a branching temporal logic that uses the path quantifiers
$\mathbf{E}$, meaning "there exists a path", and $\mathbf{A}$, meaning "for all paths". A path is an infinite sequence of states such
that each state and its successor are related by the transition relation. The syntax of CTL formulae uses temporal
operators as well. For instance, to assert that "property $\psi$ is always true on every path" or that "there is a path
on which property $\psi_1$ is true until $\psi_2$ becomes true" one writes $\mathbf{A}\Box\psi$ and $\mathbf{E}(\psi_1 \mathbf{U} \psi_2)$, respectively, where $\Box$ and
$\mathbf{U}$ are temporal operators. Various temporal operators are listed in the literature as part of the CTL syntax.
However, the operators $\bigcirc$ and $\mathbf{U}$ form a complete set from which we can express all other operators. We give the
syntax of CTL in terms of these two temporal operators and later we also use the operator $\widetilde{\mathbf{U}}$, which facilitates
our translations.

The syntax of CTL dictates that each usage of a temporal operator must be preceded by a path quantifier.
These pairs consisting of the path quantifier and the temporal operator can be nested arbitrarily, but must have
at their core a purely propositional formula. In the remainder of the paper $AP$ denotes the set of atomic
propositions $\{p_0, p_1, p_2, \ldots\}$ from which CTL formulae are built. We proceed to the formal definition of the
syntax of CTL.

S1. Atomic propositions, $\top$ and $\bot$ are CTL formulae.

S2. If $\varphi, \psi$ are CTL formulae then so are $\varphi \wedge \psi$, $\varphi \vee \psi$.

S3. If $\varphi, \psi$ are CTL formulae then $\mathbf{E}\bigcirc\psi$, $\mathbf{A}\bigcirc\psi$, $\mathbf{E}(\varphi \mathbf{U} \psi)$, $\mathbf{A}(\varphi \mathbf{U} \psi)$ are CTL formulae.

The semantics of CTL is defined over temporal Kripke structures. A temporal Kripke structure $\mathcal{K}$ is a directed
labeled graph with node set $W$, arc set $R$ and labeling function $V$. $\mathcal{K}$ need not be a tree; however, it can be
turned into an infinite labeled tree if unwound from a state $s_0$ (see [Eme90] and [Var97] for details). Below we give
the definition of temporal Kripke structures.

Definition 2.1 Let $AP$ be the set of atomic propositions. A temporal Kripke structure $\mathcal{K}$ for $AP$ is a tuple
$(W, R, V)$, where:

• $W$ is the set of states,

• $R \subseteq W \times W$ is the total accessibility relation, and

• $V : W \rightarrow 2^{AP}$ is the valuation that determines which atomic propositions are true at each state.

A finite Kripke structure $\mathcal{K}$ is a Kripke structure $(W, R, V)$ with finite $W$.

In Kripke structures the set of states $W$ can be infinite: $W$ as defined in Definition 2.1 may be of any
cardinality. In this paper we are interested in relational databases, where the universe $W$ is finite. Hence, our
Kripke structures are finite. In CTL we are dealing with infinite computation paths, which means that in order
for the accessibility relation $R$ to be meaningful, $R$ must be total [KVW00]:

    $\forall x \, \exists y \, R(x,y)$    (1)

Definition 2.2 A path $\pi$ of $\mathcal{K}$ is an infinite sequence $s_0, s_1, s_2, \ldots$ of states of $W$, such that $R(s_i, s_{i+1})$, $i \geq 0$.
We also use the notational convention $\pi^i = s_i, s_{i+1}, s_{i+2}, \ldots$.

The notation $\mathcal{K}, s \models \varphi$ means that "the formula $\varphi$ holds at state $s$ of $\mathcal{K}$". The meaning of $\models$ is formally
defined as follows:

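Definition 2.3

• $\models \top$ and $\not\models \bot$

• $\mathcal{K}, s \models p \iff p \in V(s)$, for an atomic proposition $p \in AP$

• $\mathcal{K}, s \models \neg\varphi \iff \mathcal{K}, s \not\models \varphi$

• $\mathcal{K}, s \models \varphi \vee \psi \iff \mathcal{K}, s \models \varphi$ or $\mathcal{K}, s \models \psi$

• $\mathcal{K}, s \models \varphi \wedge \psi \iff \mathcal{K}, s \models \varphi$ and $\mathcal{K}, s \models \psi$

• $\mathcal{K}, s \models \mathbf{E}\psi \iff$ there exists a path $\pi = s_0, s_1, \ldots$, with initial state $s = s_0$, such that $\mathcal{K}, \pi \models \psi$

• $\mathcal{K}, s \models \mathbf{A}\psi \iff$ for every path $\pi = s_0, s_1, \ldots$, with initial state $s = s_0$, it holds that $\mathcal{K}, \pi \models \psi$

• $\mathcal{K}, \pi \models \bigcirc\varphi \iff \mathcal{K}, \pi^1 \models \varphi$

• $\mathcal{K}, \pi \models \varphi \mathbf{U} \psi \iff$ there exists $i \geq 0$ such that $\mathcal{K}, \pi^i \models \psi$ and for all $j$, $0 \leq j < i$, $\mathcal{K}, \pi^j \models \varphi$

• $\mathcal{K}, \pi \models \varphi \widetilde{\mathbf{U}} \psi \iff$ for all $i \geq 0$ such that $\mathcal{K}, \pi^i \not\models \psi$ there exists $j$, $0 \leq j < i$, such that $\mathcal{K}, \pi^j \models \varphi$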

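A CTL state formula $\varphi$ is satisfiable if there exists a Kripke structure $\mathcal{K} = (W, R, V)$ such that $\mathcal{K}, s \models \varphi$ for
some $s \in W$. In this case $\mathcal{K}$ is a model of $\varphi$. If $\mathcal{K}, s \models \varphi$ for every $s \in W$, then $\varphi$ is true in $\mathcal{K}$, denoted $\mathcal{K} \models \varphi$. If
$\mathcal{K} \models \varphi$ for every $\mathcal{K}$, then $\varphi$ is valid, denoted $\models \varphi$. If $\mathcal{K} \models \varphi$ for every finite $\mathcal{K}$, we say that $\varphi$ is valid with respect
to the class of finite Kripke structures, denoted $\models_f \varphi$.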

The truth set of a CTL formula $\varphi$ with respect to a Kripke structure $\mathcal{K}$ is the set of states of $\mathcal{K}$ at which $\varphi$
is true. We define the truth set formally as follows:

Definition 2.4 (Truth set) Given a CTL formula $\varphi$ and a Kripke structure $\mathcal{K} = (W, R, V)$, the truth set of $\varphi$
with respect to $\mathcal{K}$, denoted $\varphi[\mathcal{K}]$, is $\{s \in W \mid \mathcal{K}, s \models \varphi\}$.
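Truth sets on a finite structure can be computed by standard fixpoint iteration. The Python fragment below is a minimal sketch under several assumptions that are not part of the paper: formulae are encoded as nested tuples with the hypothetical tags `"true"`, `"ap"`, `"not"`, `"and"`, `"EX"` (for $\mathbf{E}\bigcirc$), `"EU"` (for $\mathbf{E}(\cdot\,\mathbf{U}\,\cdot)$) and `"ER"` (for $\mathbf{E}(\cdot\,\widetilde{\mathbf{U}}\,\cdot)$), and $R$ is assumed to be total. It is a naive procedure and makes no attempt to achieve the linear bound discussed in Section 2.3.

```python
def pre_exists(R, Z):
    """States with at least one R-successor in Z."""
    return {s for (s, u) in R if u in Z}

def truth_set(f, W, R, V):
    """Truth set of formula f over the finite structure (W, R, V), R total."""
    op = f[0]
    if op == "true":
        return set(W)
    if op == "ap":
        return {s for s in W if f[1] in V[s]}
    if op == "not":
        return set(W) - truth_set(f[1], W, R, V)
    if op == "and":
        return truth_set(f[1], W, R, V) & truth_set(f[2], W, R, V)
    if op == "EX":
        return pre_exists(R, truth_set(f[1], W, R, V))
    a = truth_set(f[1], W, R, V)
    b = truth_set(f[2], W, R, V)
    if op == "EU":    # least fixpoint of Z = b or (a and EX Z)
        Z = set(b)
        while True:
            nxt = Z | (a & pre_exists(R, Z))
            if nxt == Z:
                return Z
            Z = nxt
    if op == "ER":    # greatest fixpoint of Z = b and (a or EX Z)
        Z = set(b)
        while True:
            nxt = b & (a | pre_exists(R, Z))
            if nxt == Z:
                return Z
            Z = nxt
    raise ValueError("unknown operator: %r" % (op,))

# Hypothetical structure: a -> b, b -> c, c -> b, c -> d, d -> d.
W = {"a", "b", "c", "d"}
R = {("a", "b"), ("b", "c"), ("c", "b"), ("c", "d"), ("d", "d")}
V = {"a": {"p"}, "b": {"p"}, "c": {"p"}, "d": {"q"}}
false = ("not", ("true",))
always_p = truth_set(("ER", false, ("ap", "p")), W, R, V)           # E(false U~ p)
p_until_q = truth_set(("EU", ("ap", "p"), ("ap", "q")), W, R, V)    # E(p U q)
print(always_p, p_until_q)
```

On this structure the truth set of $\mathbf{E}(\bot \widetilde{\mathbf{U}} p)$ is `{a, b, c}`, the states from which a path staying in `p`-states forever exists, and $\mathbf{E}(p \mathbf{U} q)$ holds at every state.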

2.2 Normal Forms 

CTL formulae can be transformed into two normal forms: existential normal form and positive normal form. The
translations we give in Sections 4 and 7 cover each of these two syntactic variations of CTL.

2.2.1 Existential Normal Form 

In existential normal form negation is allowed to appear in front of CTL formulae. The universal path quantifier
$\mathbf{A}$ is cast in terms of its dual existential path quantifier $\mathbf{E}$ using negation and the temporal operator $\widetilde{\mathbf{U}}$: $\mathbf{A}(\psi_1 \mathbf{U} \psi_2)$
becomes $\neg\mathbf{E}(\neg\psi_1 \widetilde{\mathbf{U}} \neg\psi_2)$. The $\widetilde{\mathbf{U}}$ operator was initially introduced in [Var98, KVW00] as the dual operator of
$\mathbf{U}$. One can think of $\mathbf{E}(\psi_1 \widetilde{\mathbf{U}} \psi_2)$ as saying that there exists a path on which:

(1) either $\psi_2$ always holds, or

(2) the first occurrence of $\neg\psi_2$ is strictly preceded by an occurrence of $\psi_1$.

In general, every CTL formula can be written in existential normal form using negation, the temporal operators
$\bigcirc$, $\mathbf{U}$, $\widetilde{\mathbf{U}}$ and the existential path quantifier $\mathbf{E}$ (without the universal path quantifier $\mathbf{A}$). The syntax in
this case is given by rules S'1-S'3, and Proposition 2.1 states formally the equivalence of the two forms.

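S'1. Atomic propositions and $\top$ are CTL formulae.

S'2. If $\varphi, \psi$ are CTL formulae then so are $\neg\varphi$, $\varphi \wedge \psi$.

S'3. If $\varphi, \psi$ are CTL formulae then $\mathbf{E}\bigcirc\psi$, $\mathbf{E}(\varphi \mathbf{U} \psi)$ and $\mathbf{E}(\varphi \widetilde{\mathbf{U}} \psi)$ are CTL formulae.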

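Proposition 2.1 Every CTL formula $\varphi$ can be transformed into a CTL formula $\varphi'$ in existential normal form
such that $\mathcal{K}, s \models \varphi$ iff $\mathcal{K}, s \models \varphi'$ for every $\mathcal{K} = (W, R, V)$ and every $s \in W$.

Proof

The universal path quantifier $\mathbf{A}$ is expressed as follows: $\mathbf{A}\bigcirc\psi$ is rewritten as $\neg\mathbf{E}\bigcirc\neg\psi$, $\mathbf{A}(\psi_1 \mathbf{U} \psi_2)$ as $\neg\mathbf{E}(\neg\psi_1 \widetilde{\mathbf{U}} \neg\psi_2)$
and $\mathbf{A}(\psi_1 \widetilde{\mathbf{U}} \psi_2)$ as $\neg\mathbf{E}(\neg\psi_1 \mathbf{U} \neg\psi_2)$. The correctness of these transformations follows immediately from Definition
2.3. Also, $\bot$ can be viewed as an abbreviation of $\neg\top$. ⊣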
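The rewriting used in the proof is a straightforward recursion on the structure of the formula. The Python fragment below sketches it under assumptions of our own: formulae are nested tuples with hypothetical tags (`"AX"`, `"AU"`, `"AR"` for the universal operators, `"EX"`, `"EU"`, `"ER"` for the existential ones, where `R` stands for $\widetilde{\mathbf{U}}$), and disjunction is eliminated by De Morgan's law, a step the proof leaves implicit.

```python
def to_enf(f):
    """Rewrite a CTL formula (nested tuples) into existential normal form."""
    op = f[0]
    if op in ("true", "ap"):
        return f
    if op == "false":                      # bottom abbreviates not-top
        return ("not", ("true",))
    if op == "not":
        return ("not", to_enf(f[1]))
    if op == "and":
        return ("and", to_enf(f[1]), to_enf(f[2]))
    if op == "or":                         # assumption: De Morgan, not stated in the proof
        return ("not", ("and", ("not", to_enf(f[1])), ("not", to_enf(f[2]))))
    if op == "EX":
        return ("EX", to_enf(f[1]))
    if op in ("EU", "ER"):
        return (op, to_enf(f[1]), to_enf(f[2]))
    if op == "AX":                         # AX psi  ==>  not EX not psi
        return ("not", ("EX", ("not", to_enf(f[1]))))
    dual = {"AU": "ER", "AR": "EU"}        # A(.. U ..) and A(.. U~ ..) swap to the dual E form
    if op in dual:
        return ("not", (dual[op], ("not", to_enf(f[1])), ("not", to_enf(f[2]))))
    raise ValueError("unknown operator: %r" % (op,))

p, q = ("ap", "p"), ("ap", "q")
enf = to_enf(("AU", p, q))
print(enf)
```

For $\mathbf{A}(p \mathbf{U} q)$ the result encodes $\neg\mathbf{E}(\neg p \widetilde{\mathbf{U}} \neg q)$, matching the second rewriting in the proof.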

The translation presented in Section 4 translates CTL formulae in existential normal form into stratified
Datalog. As the universal quantifier is not used, stratified Datalog expresses CTL formulae nicely.

2.2.2 Positive Normal Form [Var98]

Every CTL formula can be equivalently written in positive normal form, where negation is applied only to atomic
propositions. However, to compensate for the loss of full negation we also need to use the temporal operator $\widetilde{\mathbf{U}}$.
Every CTL formula can be written in positive normal form using negation applied only to atomic propositions,
the temporal operators $\bigcirc$, $\mathbf{U}$ and $\widetilde{\mathbf{U}}$, and both the existential $\mathbf{E}$ and the universal $\mathbf{A}$ path quantifiers. This is achieved
by pushing negations inward as far as possible using De Morgan's laws and dualities of path quantifiers and
temporal operators. The syntax of CTL in this case is given by rules S''1-S''3.
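S''1. Atomic propositions, $\top$ and their negations are CTL formulae.

S''2. If $\varphi, \psi$ are CTL formulae then so are $\varphi \wedge \psi$, $\varphi \vee \psi$.

S''3. If $\varphi, \psi$ are CTL formulae then $\mathbf{E}\bigcirc\psi$, $\mathbf{A}\bigcirc\psi$, $\mathbf{E}(\varphi \mathbf{U} \psi)$, $\mathbf{A}(\varphi \mathbf{U} \psi)$, $\mathbf{E}(\varphi \widetilde{\mathbf{U}} \psi)$ and $\mathbf{A}(\varphi \widetilde{\mathbf{U}} \psi)$ are CTL formulae.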

Proposition 2.2 Every CTL formula $\varphi$ can be transformed into a CTL formula $\varphi'$ in positive normal form such
that $\mathcal{K}, s \models \varphi$ iff $\mathcal{K}, s \models \varphi'$ for every $\mathcal{K} = (W, R, V)$ and every $s \in W$.

Proof

The proof can be found in [Var98]. ⊣
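Pushing negations inward, as described at the start of this subsection, can be sketched as a single recursive pass that carries a polarity flag. The Python fragment below is an illustration only and not the construction of [Var98]; the tuple encoding and the tags (`"ER"` and `"AR"` for the $\widetilde{\mathbf{U}}$ operator under $\mathbf{E}$ and $\mathbf{A}$) are hypothetical.

```python
# Each connective or path-quantified operator paired with its dual.
DUAL = {"and": "or", "or": "and",
        "EX": "AX", "AX": "EX",
        "EU": "AR", "AR": "EU",
        "AU": "ER", "ER": "AU"}

def to_pnf(f, negated=False):
    """Push negations inward so that they apply only to atoms (and to top)."""
    op = f[0]
    if op == "not":
        return to_pnf(f[1], not negated)
    if op == "false":                       # treat bottom as not-top
        return to_pnf(("not", ("true",)), negated)
    if op in ("ap", "true"):
        return ("not", f) if negated else f
    new_op = DUAL[op] if negated else op
    return (new_op,) + tuple(to_pnf(g, negated) for g in f[1:])

p, q = ("ap", "p"), ("ap", "q")
pnf = to_pnf(("not", ("EU", p, ("not", q))))
print(pnf)
```

For $\neg\mathbf{E}(p \mathbf{U} \neg q)$ the sketch returns the encoding of $\mathbf{A}(\neg p \widetilde{\mathbf{U}} q)$, in which negation appears only in front of the atom $p$.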

The translation in Section 7 considers CTL formulae in positive normal form and translates them into Datalog
enhanced with the Succ operator; the latter is needed to express the universal path quantifier. It turns out that
in this translation there is no need for negation in recursively defined predicates. Table 2 presents the two
normal forms in which a CTL formula can be written, and the corresponding fragments of Datalog used for
the translation.



Translation    CTL Normal Form            Datalog

[Section 4]    Existential Normal Form    Stratified Datalog
[Section 7]    Positive Normal Form       Datalog + Succ

Table 2: Normal forms vs. Datalog fragments



2.3 Model Checking and Complexity 

Model checking is the problem of verifying the conformance of a finite state system to a certain behavior, i.e.,
verifying that the labeled transition graph satisfies (is a model of) the formula that specifies the behavior. Hence,
given a labeled transition graph $\mathcal{K}$, a state $s$ and a temporal formula $\varphi$, the model checking problem for $\mathcal{K}$ and $\varphi$
is to decide whether $\mathcal{K}, s \models \varphi$. The size of the labeled transition system $\mathcal{K}$, denoted $|\mathcal{K}|$, is taken to be $|W| + |R|$,
and the size of the formula $\varphi$, denoted $|\varphi|$, is the number of symbols in $\varphi$.

For CTL formulae the model checking problem is known to be P-hard [Sch03], which makes the development of
efficient parallel algorithms highly improbable. However, there exist efficient algorithms that solve it
in $O(|\mathcal{K}||\varphi|)$ time [CES86]. It is insightful to examine how the two parameters $|\mathcal{K}|$ and $|\varphi|$ affect the complexity.
This can be done by introducing the following two complexity measures for the model checking problem [VW86]:

• data complexity, which assumes a fixed formula and variable Kripke structures, and 

• program or formula complexity, which refers to variable formulae over a fixed Kripke structure. 

CTL model checking is NLOGSPACE-complete with respect to data complexity² and its formula complexity
is in $O(\log|\varphi|)$ space [Sch03]. Another important problem for CTL is the validity problem, that is, deciding
whether a formula $\varphi$ is valid or not. This problem is much harder; it has been shown to be EXPTIME-complete
[Var97]. The following two theorems state known results on CTL on which we build in Section 6 to argue about
stratified Datalog.

Theorem 2.1 (Validity) [Var97] The validity problem for CTL is EXPTIME-complete.

CTL exhibits another important property, namely the bounded model property: if a formula $\varphi$ is satisfiable,
then $\varphi$ is satisfiable in a structure of bounded cardinality.³

Theorem 2.2 (Bounded Model Property) [Eme90] If a CTL formula $\varphi$ has a model, then $\varphi$ has a model with at
most $2^{|\varphi|}$ states.

² In real-life examples the crucial factor is $|\mathcal{K}|$, which is much larger than $|\varphi|$.

³ As M. Vardi remarks in [Var97], this is stronger than the finite model property, which says that if $\varphi$ is satisfiable, then $\varphi$ is
satisfiable in a finite structure.






3 Datalog 



Datalog [Ull88] is a query language for relational databases. An atom is an expression of the form $E(x_1, \ldots, x_r)$,
where $E$ is a predicate symbol and $x_1, \ldots, x_r$ are either variables or constants. A ground fact (or ground atom)
is an atom of the form $E(c_1, \ldots, c_r)$, where $c_1, \ldots, c_r$ are constants. From a logic perspective, a relation $E$
corresponding to predicate symbol $E$ is just a finite set of ground facts of $E$, and a relational database $D$ is
a finite collection of relations. To simplify notation, in the rest of this paper we use the same symbol for the
relation and the predicate symbol; which one is meant will be made clear by the context.

Definition 3.1 [DEGV01] A database schema $\mathcal{D}$ is an ordered tuple $(W, E_1, \ldots, E_n)$, where $W$ is the domain
of the schema and $E_1, \ldots, E_n$ are predicate symbols, each with its associated arity.

Given a database schema $\mathcal{D}$, the set of all ground facts formed from $E_1, \ldots, E_n$ using as constants the elements
of $W$ is denoted $\mathcal{H}_{\mathcal{D}}(W)$. A database $D$ over $\mathcal{D}$ is a finite subset of $\mathcal{H}_{\mathcal{D}}(W)$; in this case, we say that $\mathcal{D}$ is the
underlying schema of $D$. The size of a database $D$, denoted $|D|$, is the number of ground facts in $D$.

Definition 3.2 A Datalog program $\Pi$ is a finite set of function-free Horn clauses, called rules, of the form:

    $G(x_1, \ldots, x_n) \leftarrow B_1(y_{1,1}, \ldots, y_{1,n_1}), \ldots, B_k(y_{k,1}, \ldots, y_{k,n_k})$

where:

- $x_1, \ldots, x_n$ are variables,

- the $y_{i,j}$'s are either variables or constants,

- $G(x_1, \ldots, x_n)$ is a predicate atom, called the head of the rule, and

- $B_1(y_{1,1}, \ldots, y_{1,n_1}), \ldots, B_k(y_{k,1}, \ldots, y_{k,n_k})$ are atoms that comprise the body of the rule.

Predicates that appear in the head of some rule are called IDB (Intensional Database) predicates, while
predicates that appear only in the bodies of the rules are called EDB (Extensional Database) predicates. Each
Datalog program $\Pi$ is associated with an ordered pair of database schemas $(\mathcal{D}_i, \mathcal{D}_o)$, called the input-output
schema, as follows: $\mathcal{D}_i$ and $\mathcal{D}_o$ have the same domain and contain exactly the EDB and IDB predicates of $\Pi$,
respectively. Given a database $D$ over $\mathcal{D}_i$, the set of ground facts for the IDB predicates which can be deduced
from $D$ by applications of the rules in $\Pi$ is the output database $D'$ (over $\mathcal{D}_o$), denoted $\Pi(D)$. Databases over
$\mathcal{D}_i$ are mapped to databases over $\mathcal{D}_o$ via $\Pi$.

Definition 3.3 Given a Datalog program 11 we distinguish an IDB predicate G and call it the goal (or query) 
predicate ofli. Let D be an input database'^; The query evaluation problem for G and D is to compute the set 
of ground facts of G in 11(D), denoted Gii{D). I 

The dependency graph of a Datalog program is a directed graph whose nodes are the IDB predicates of the program; there is an arc from predicate $B$ to predicate $G$ if there is a rule whose head is an instance of $G$ and whose body contains at least one occurrence of $B$. The size of a rule $r$, denoted $|r|$, is the number of symbols appearing in $r$. For a Datalog program $\Pi$ consisting of the rules $r_0, \ldots, r_n$, the size of $\Pi$, denoted $|\Pi|$, is $|r_0| + \ldots + |r_n|$.
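The IDB/EDB split and the dependency graph can be read off a program mechanically. The Python sketch below is only an illustration: it assumes a hypothetical encoding in which a rule is a pair `(head, body)` and every body entry is a pair `(predicate, negated)`; the name `classify` is not taken from the paper.

```python
def classify(rules):
    """Split predicates into IDB/EDB and build the dependency graph.

    `rules` is a list of (head, body) pairs, where `body` is a list of
    (predicate_name, negated) pairs. This encoding is an assumption made
    for the example only.
    """
    idb = {head for head, _ in rules}                  # occur in some head
    in_bodies = {pred for _, body in rules for pred, _ in body}
    edb = in_bodies - idb                              # occur only in bodies
    # Arc (B, G): some rule with head G mentions the IDB predicate B in its body.
    arcs = {(pred, head) for head, body in rules
            for pred, _ in body if pred in idb}
    return idb, edb, arcs


# G(x) <- G1(x), not A(x);  G(x) <- R(x,y), G1(y);  A(x) <- R(x,y);  G1(x) <- P(x)
rules = [
    ("G", [("G1", False), ("A", True)]),
    ("G", [("R", False), ("G1", False)]),
    ("A", [("R", False)]),
    ("G1", [("P", False)]),
]
idb, edb, arcs = classify(rules)
```

Here `R` and `P` come out as EDB predicates, and the graph has the two arcs from `G1` and `A` into `G`.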

Stratified Datalog 

Intuitively, stratified Datalog is a fragment of Datalog with negation allowed in any predicate, under the constraint that negated predicates are computed in previous strata. Each head predicate of $\Pi$ is a head predicate in precisely one stratum $\Pi_i$ and appears only in the body of rules of higher strata $\Pi_j$ ($j > i$) [GGV02]. In particular this means that:

1. If $G$ is the head predicate of a rule that contains a negated $B$ as a subgoal, then $B$ is in a lower stratum than $G$.

2. If $G$ is the head predicate of a rule that contains a non negated $B$ as a subgoal, then the stratum of $G$ is at least as high as the stratum of $B$.

In other words, a program $\Pi$ is stratified if there is an assignment $str()$ of integers $0, 1, \ldots$ to the predicates in $\Pi$, such that for each clause $r$ of $\Pi$ the following holds: if $G$ is the head predicate of $r$ and $B$ a predicate in the body of $r$, then $str(G) \geq str(B)$ if $B$ is non negated, and $str(G) > str(B)$ if $B$ is negated.

*In the sequel of the paper we assume, without explicitly mentioning it, that the input databases for a Datalog program $\Pi$ have the appropriate schema.

Example 3.1 For the stratified program:

$$\begin{array}{l} A \leftarrow \neg B \\ B \leftarrow \neg C \\ C \leftarrow D \end{array}$$

$str()$ is the following: $str(C) = str(D) = 0$, $str(B) = 1$ and $str(A) = 2$. ▲
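A stratification of this kind can be computed by raising stratum numbers until the two inequalities on $str()$ hold. The sketch below is a minimal illustration under assumed conventions: rules are encoded as `(head, body)` pairs with `(predicate, negated)` body entries, and the function name `stratify` is hypothetical. It relies on the standard observation that the least stratification of a program with $k$ predicates never needs a stratum number of $k$ or more, so reaching $k$ signals recursion through negation.

```python
def stratify(rules):
    """Return the least assignment str() as a dict, or None if the
    program is not stratified (recursion through negation)."""
    preds = {h for h, _ in rules} | {p for _, body in rules for p, _ in body}
    level = {p: 0 for p in preds}
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            for pred, negated in body:
                # str(G) >= str(B) for positive B, str(G) > str(B) for negated B
                need = level[pred] + (1 if negated else 0)
                if level[head] < need:
                    if need >= len(preds):     # would exceed any valid stratum
                        return None
                    level[head] = need
                    changed = True
    return level


# The program of Example 3.1: A <- not B;  B <- not C;  C <- D
example = [("A", [("B", True)]), ("B", [("C", True)]), ("C", [("D", False)])]
levels = stratify(example)

# P <- not Q;  Q <- not P  recurses through negation
cyclic = stratify([("P", [("Q", True)]), ("Q", [("P", True)])])
```

On the program of Example 3.1 the computed levels agree with the assignment given there; the second program has no stratification.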

The dependency graph can be used to define strata in a given program. In the dependency graph of a stratified program $\Pi$, whenever there is a rule with head predicate $G$ and negated subgoal predicate $B$, there is no path from $G$ to $B$. That is, there is no recursion through negation in the dependency graph of a stratified program. The number of strata of $\Pi$ is denoted $strata(\Pi)$. For more details on stratified Datalog see [Ull88], [ZCF+97].

$Datalog_{Succ}$

$Datalog_{Succ}$ is Datalog where the domain is totally ordered and which uses the binary built-in predicate $Succ(X, Y)$ to express that $Y$ is the successor of $X$, where $X$ and $Y$ take values from a totally ordered domain. Papadimitriou proved in [Pap85] that $Datalog_{Succ}$ captures polynomial time.

Notice that the term "successor" is overloaded in the following sense. In the literature on CTL, successor is used to refer to the second argument of $R(x, y)$, and we say that $y$ is the child of $x$. In $Datalog_{Succ}$ the built-in predicate $Succ$ means that an element is the successor of another element in the total order. Both refer to the next element of some order, but on a different relation. In the sequel of the paper, when we mean the first we will use the term "successor in $R$", while for the second we will use the term "successor built-in predicate". When we do not specify, it should be evident from the context.

Bottom-up evaluation and complexity 

The bottom-up evaluation of a query, used in the proofs of the main theorems of this work, initializes the IDB predicates to be empty and repeatedly applies the rules to add tuples to the IDB predicates, until no new tuples can be added [Ull88], [AHV95], [ZCF+97]. In stratified Datalog, strata are used in order to structure the computation in a bottom-up fashion. That is, the head predicates of a given stratum are evaluated only after all head predicates of the lower strata have been computed. This way any negated subgoal is treated as if it were an EDB relation.
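The evaluation strategy just described can be sketched in a few lines of Python. This is a naive illustration under stated assumptions, not an optimized engine: a rule is encoded as a pair `(head, derive)`, where the hypothetical callback `derive(facts)` returns the tuples produced by one application of the rule, and unary relations are stored as plain sets of constants.

```python
def evaluate(strata, edb):
    """Naive bottom-up evaluation, one stratum at a time.

    `strata` lists the rule groups from the lowest stratum upwards; each
    rule is (head, derive). The encoding is an assumption of this sketch.
    """
    facts = {name: set(rows) for name, rows in edb.items()}
    for rules in strata:
        for head, _ in rules:
            facts.setdefault(head, set())      # IDB predicates start empty
        changed = True
        while changed:                         # repeat until no new tuples
            changed = False
            for head, derive in rules:
                new = derive(facts) - facts[head]
                if new:
                    facts[head] |= new
                    changed = True
    return facts


edb = {"R": {(1, 2), (2, 3), (3, 3), (4, 4)}, "P": {1, 2}, "Q": {3}}
strata = [
    [   # stratum 0: G(x) <- Q(x);  G(x) <- P(x), R(x,y), G(y);  W collects the domain
        ("G", lambda f: set(f["Q"])),
        ("G", lambda f: {x for (x, y) in f["R"] if x in f["P"] and y in f["G"]}),
        ("W", lambda f: {x for (x, _) in f["R"]} | {y for (_, y) in f["R"]}
                        | f["P"] | f["Q"]),
    ],
    [   # stratum 1: N(x) <- W(x), not G(x); G is complete, so it acts like an EDB
        ("N", lambda f: f["W"] - f["G"]),
    ],
]
facts = evaluate(strata, edb)
```

Because `N` is evaluated only after the stratum of `G` has reached its fixpoint, the negated subgoal is looked up in a finished relation, exactly as in the description above.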

There are two main complexity measures for Datalog and its extensions. 

• data complexity which assumes a fixed Datalog program and variable input databases, and 

• program complexity which refers to variable Datalog programs over a fixed input database. 

In general, Datalog is P-complete with respect to data complexity and EXPTIME-complete with respect to program complexity [Var82], [Imm86]. Although there are different semantics for negation in Logic Programming (e.g., stratified negation, well-founded semantics, stable model semantics, etc.), for stratified programs these semantics coincide. Recall that a program is stratified if there is no recursion through negation. Stratified programs have a unique stable model which coincides with the stratified model, obtained by partitioning the program into an ordered number of strata and computing the fixpoints of every stratum in their order. Datalog with stratified negation is P-complete with respect to data complexity and EXPTIME-complete with respect to program complexity [ABW88]. An excellent survey regarding these issues is [DEGV01].

4 Embedding CTL to stratified Datalog

In the present and next section we establish that there is a fragment of stratified Datalog which has the same expressive power as CTL. This fragment, which we define in Subsection 4.1, is called STD (for Stratified Temporal Datalog). The following theorem is the result of the two main theorems of Sections 4 and 5 (Theorems 4.2 and 5.2) and it states that CTL and STD have the same expressive power.



Theorem 4.1 Consider the languages CTL and STD. The following hold.

1. Let $\mathcal{K}$ be a finite Kripke structure and $\varphi$ a CTL formula. Then there is a relational database $D$ and a STD program $\Pi$ such that the following holds:

$$\varphi[\mathcal{K}] = G_\Pi(D) \tag{2}$$

Moreover $D$ and $\Pi$ are computed in time linear in the size of $\mathcal{K}$ and $\varphi$.

2. Let $D$ be a relational database and $\Pi$ a STD program. Then there is a finite Kripke structure $\mathcal{K}$ and a CTL formula $\varphi$ such that the following holds:

$$G_\Pi(D) = \varphi[\mathcal{K}] \tag{3}$$

Moreover $\mathcal{K}$ and $\varphi$ are computed in time linear in the size of $D$ and $\Pi$.

We start by giving the definition of the class STD in the following subsection together with some properties. 



4.1 The class STD 
4.1.1 Definition 

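The programs of this class are built up from: (a) a single binary predicate $R$ and an arbitrary number of unary EDB predicates $P_0, \ldots, P_n$, and (b) binary and unary IDB predicates. One unary IDB predicate is chosen to be the goal predicate of the program.

The programs

$$\left\{ G(x) \leftarrow P_i(x) \right. \qquad \text{and} \qquad \left\{ \begin{array}{l} G(x) \leftarrow W(x) \\ \Pi^n \end{array} \right.$$

where $\Pi^n$ is an abbreviation for

$$\left\{ \begin{array}{l} W(x) \leftarrow R(x, y) \\ W(x) \leftarrow R(y, x) \\ W(x) \leftarrow P_0(x) \\ \quad \vdots \\ W(x) \leftarrow P_n(x) \end{array} \right.$$

are STD programs having $G$ as the goal predicate. Inductively, if $\Pi_1, \Pi_2$ are STD programs with goal predicates $G_1, G_2$ respectively and with disjoint sets of IDB predicates (with the exception of $A$ and $W$, which are the same in all programs), then $\Pi$ is the union of the rules of $\Pi_1$, $\Pi_2$ and one of the following five sets of rules; predicate names $G$ and $B$ are new.

$$\left\{ \begin{array}{l} G(x) \leftarrow W(x), \neg G_1(x) \end{array} \right.$$

$$\left\{ \begin{array}{l} G(x) \leftarrow G_1(x), G_2(x) \end{array} \right.$$

$$\left\{ \begin{array}{l} G(x) \leftarrow G_1(x), \neg A(x) \\ G(x) \leftarrow R(x, y), G_1(y) \\ A(x) \leftarrow R(x, y) \end{array} \right.$$

$$\left\{ \begin{array}{l} G(x) \leftarrow G_2(x) \\ G(x) \leftarrow G_1(x), R(x, y), G(y) \end{array} \right.$$

$$\left\{ \begin{array}{l} G(x) \leftarrow G_1(x), G_2(x) \\ G(x) \leftarrow G_2(x), \neg A(x) \\ G(x) \leftarrow B(x, x) \\ G(x) \leftarrow G_2(x), R(x, y), G(y) \\ B(x, y) \leftarrow G_2(x), R(x, y), G_2(y) \\ B(x, y) \leftarrow G_2(x), R(x, u), B(u, y) \\ A(x) \leftarrow R(x, y) \end{array} \right.$$

Only the programs produced by the rules above are STD programs.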



4.1.2 Properties 

In the following paragraphs we provide some intuition about the IDB predicates of STD programs and we give 
a succinct way to refer to STD programs which reflects their connection to CTL. Finally we show that STD 
programs are stratified. 

Predicates $A$ and $W$ are auxiliary predicates denoting the "ancestor" relation and the "domain" respectively. The intuition behind the IDB predicates $W$, $A$ and $B$ is the following:

• $W(x)$, as defined by $\Pi^n$, says that $x$ belongs to the domain of the database, i.e., appears in the relations that comprise the database.

• $A(x)$ asserts that state $x$ has at least one successor.

• $B(x, y)$ captures the notion of a path from state $x$ to state $y$, such that $G_2$ holds at every state along this path. In view of the fact that $G_2$ corresponds to a CTL formula (let's say $\psi_2$), $B(x, x)$ asserts the existence of a cycle having the property that $\psi_2$ holds at every state of this cycle.

For a more succinct presentation and for ease of reference we use the program operators $\overline{[\cdot]}$, $\bigwedge[\cdot, \cdot]$, $\mathbf{X}[\cdot]$, $\mathbf{U}[\cdot, \cdot]$ and $\widetilde{\mathbf{U}}[\cdot, \cdot]$ depicted in Figure 1, where programs $\Pi_1$ and $\Pi_2$ are over disjoint sets of IDB predicates (except $A$ and $W$, which are always the same) and $G$ and $B$ are new predicate names. It is useful to note that, using these operators, the class STD can be equivalently defined as follows:

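Definition 4.1

• The programs

$$\left\{ G(x) \leftarrow P_i(x) \right. \qquad \text{and} \qquad \left\{ \begin{array}{l} G(x) \leftarrow W(x) \\ \Pi^n \end{array} \right.$$

are $STD_n$ programs having $G$ as the goal predicate.

• If $\Pi_1$ and $\Pi_2$ are $STD_n$ programs with goal predicates $G_1$ and $G_2$ respectively, then $\overline{[\Pi_1]}$, $\bigwedge[\Pi_1, \Pi_2]$, $\mathbf{X}[\Pi_1]$, $\mathbf{U}[\Pi_1, \Pi_2]$ and $\widetilde{\mathbf{U}}[\Pi_1, \Pi_2]$ are also $STD_n$ programs with goal predicate $G$.

• The class STD is the union of the $STD_n$ subclasses:

$$STD = \bigcup_{n \geq 0} STD_n \tag{4}$$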

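Example 4.1 Consider the STD program $\Pi = \widetilde{\mathbf{U}}[\widetilde{\mathbf{U}}[\Pi_1, \Pi_2], \overline{[\Pi_3]}]$, where $\Pi_1$, $\Pi_2$ and $\Pi_3$ are the simple STD programs $G_1(x) \leftarrow P(x)$, $G_2(x) \leftarrow Q(x)$ and $G_3(x) \leftarrow T(x)$, respectively. The rules comprising $\Pi$ are shown below ($G_4$ and $G_5$ are the goals of the subprograms $\widetilde{\mathbf{U}}[\Pi_1, \Pi_2]$ and $\overline{[\Pi_3]}$):

$$\Pi = \left\{ \begin{array}{l}
G(x) \leftarrow G_4(x), G_5(x) \\
G(x) \leftarrow G_5(x), \neg A(x) \\
G(x) \leftarrow B(x, x) \\
G(x) \leftarrow G_5(x), R(x, y), G(y) \\
B(x, y) \leftarrow G_5(x), R(x, y), G_5(y) \\
B(x, y) \leftarrow G_5(x), R(x, u), B(u, y) \\
G_4(x) \leftarrow G_1(x), G_2(x) \\
G_4(x) \leftarrow G_2(x), \neg A(x) \\
G_4(x) \leftarrow B_1(x, x) \\
G_4(x) \leftarrow G_2(x), R(x, y), G_4(y) \\
B_1(x, y) \leftarrow G_2(x), R(x, y), G_2(y) \\
B_1(x, y) \leftarrow G_2(x), R(x, u), B_1(u, y) \\
G_5(x) \leftarrow W(x), \neg G_3(x) \\
G_1(x) \leftarrow P(x) \\
G_2(x) \leftarrow Q(x) \\
G_3(x) \leftarrow T(x) \\
A(x) \leftarrow R(x, y) \\
\Pi^n
\end{array} \right.$$

▲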

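The following proposition proves that the STD class is a fragment of stratified Datalog.

Proposition 4.1 Every STD program is stratified.

Proof

Given that $\Pi_1$, $\Pi_2$ are stratified programs, any set of rules that might be added to $\Pi_1$, $\Pi_2$ in order to form program $\Pi$ according to Definition 4.1 preserves the stratification of the program. Indeed, the new predicates $G$ and $B$ do not occur in $\Pi_1$ or $\Pi_2$, and the only negated subgoals in the added rules are $G_1$ and $A$, neither of which depends on $G$ or $B$. ⊣

The query operators of the class STD

$$\Pi^n = \left\{ \begin{array}{l} W(x) \leftarrow R(x, y) \\ W(x) \leftarrow R(y, x) \\ W(x) \leftarrow P_0(x) \\ \quad \vdots \\ W(x) \leftarrow P_n(x) \end{array} \right.$$

$$\overline{[\Pi_1]} = \left\{ \begin{array}{l} G(x) \leftarrow W(x), \neg G_1(x) \end{array} \right.$$

$$\bigwedge[\Pi_1, \Pi_2] = \left\{ \begin{array}{l} G(x) \leftarrow G_1(x), G_2(x) \end{array} \right.$$

$$\mathbf{X}[\Pi_1] = \left\{ \begin{array}{l} G(x) \leftarrow G_1(x), \neg A(x) \\ G(x) \leftarrow R(x, y), G_1(y) \\ A(x) \leftarrow R(x, y) \end{array} \right.$$

$$\mathbf{U}[\Pi_1, \Pi_2] = \left\{ \begin{array}{l} G(x) \leftarrow G_2(x) \\ G(x) \leftarrow G_1(x), R(x, y), G(y) \end{array} \right.$$

$$\widetilde{\mathbf{U}}[\Pi_1, \Pi_2] = \left\{ \begin{array}{l} G(x) \leftarrow G_1(x), G_2(x) \\ G(x) \leftarrow G_2(x), \neg A(x) \\ G(x) \leftarrow B(x, x) \\ G(x) \leftarrow G_2(x), R(x, y), G(y) \\ B(x, y) \leftarrow G_2(x), R(x, y), G_2(y) \\ B(x, y) \leftarrow G_2(x), R(x, u), B(u, y) \\ A(x) \leftarrow R(x, y) \end{array} \right.$$

Figure 1: These are the query operators used in the definition of the class STD. $\Pi_1$ and $\Pi_2$ are $STD_n$ programs with goal predicates $G_1$ and $G_2$ respectively. $G$ and $B$ are "fresh" predicate symbols, i.e., they do not appear in $\Pi_1$ or $\Pi_2$. In contrast, $A$ and $W$ are the same in all programs. $\Pi^n$ is a convenient abbreviation of the rules depicted here.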



4.2 From CTL formulae to relational queries 

Embedding CTL into STD amounts to defining a mapping $h = (h_f, h_s)$ such that:

1. $h_f$ maps CTL formulae into STD programs, that is, given a formula $\varphi$, $h_f(\varphi)$ is a program $\Pi$ with unary goal predicate $G$.

2. $h_s$ maps temporal Kripke structures (on which CTL formulae are interpreted) to relational databases, i.e., $h_s(\mathcal{K})$ is a database $D$.

3. For this mapping it holds:

$$\varphi[\mathcal{K}] = G_\Pi(D), \quad \text{where } \Pi = h_f(\varphi) \text{ and } D = h_s(\mathcal{K})$$

The correspondence of CTL formulae to Datalog programs is the core of both translations. The exact mapping $h_f$ of CTL formulae into STD programs is given below. Note that we use the operators of Figure 1 to facilitate the reading and that $\Pi_i$ corresponds to subformula $\psi_i$, $i = 1, 2$.

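Definition 4.2 Let $\varphi$ be a CTL formula and let $p_0, \ldots, p_n$ be the atomic propositions appearing in $\varphi$. Then $h_f(\varphi)$ is the $STD_n$ program defined recursively as follows:

1. If $\varphi \equiv p_i$ or $\varphi \equiv \top$, then $h_f(\varphi)$ is

$$\left\{ G(x) \leftarrow P_i(x) \right. \qquad \text{and} \qquad \left\{ \begin{array}{l} G(x) \leftarrow W(x) \\ \Pi^n \end{array} \right.$$

respectively.

2. If $\varphi \equiv \neg\psi_1$ or $\varphi \equiv \psi_1 \wedge \psi_2$, then $h_f(\varphi)$ is $\overline{[\Pi_1]}$ and $\bigwedge[\Pi_1, \Pi_2]$, respectively.

3. If $\varphi \equiv \mathbf{E}\bigcirc\psi_1$ or $\varphi \equiv \mathbf{E}(\psi_1 \mathbf{U} \psi_2)$ or $\varphi \equiv \mathbf{E}(\psi_1 \widetilde{\mathbf{U}} \psi_2)$, then $h_f(\varphi)$ is $\mathbf{X}[\Pi_1]$, $\mathbf{U}[\Pi_1, \Pi_2]$ and $\widetilde{\mathbf{U}}[\Pi_1, \Pi_2]$, respectively. ⊣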
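The recursive translation of Definition 4.2 can be sketched as a small Python function that emits the rules as strings. Everything about the encoding is an assumption of this illustration: formulae are nested tuples, `"EUw"` stands for the modality $\widetilde{\mathbf{U}}$, fresh predicates are named `G0, G1, ...` and `B<k>`, and the shared rules for $A$ and $W$ are appended once at the end.

```python
import itertools


def translate(phi, n):
    """Translate a CTL formula (nested tuples) into a list of rule strings.

    Assumed encoding: ("ap", i), ("top",), ("not", f), ("and", f, g),
    ("EX", f), ("EU", f, g), ("EUw", f, g).  Returns (goal_name, rules).
    """
    fresh = itertools.count()
    rules, shared = [], set()

    def go(f):
        g = f"G{next(fresh)}"          # fresh goal predicate for this subformula
        op = f[0]
        if op == "ap":
            rules.append(f"{g}(x) <- P{f[1]}(x)")
        elif op == "top":
            shared.add("W")
            rules.append(f"{g}(x) <- W(x)")
        elif op == "not":
            g1 = go(f[1])
            shared.add("W")
            rules.append(f"{g}(x) <- W(x), not {g1}(x)")
        elif op == "and":
            g1, g2 = go(f[1]), go(f[2])
            rules.append(f"{g}(x) <- {g1}(x), {g2}(x)")
        elif op == "EX":
            g1 = go(f[1])
            shared.add("A")
            rules.extend([f"{g}(x) <- {g1}(x), not A(x)",
                          f"{g}(x) <- R(x,y), {g1}(y)"])
        elif op == "EU":
            g1, g2 = go(f[1]), go(f[2])
            rules.extend([f"{g}(x) <- {g2}(x)",
                          f"{g}(x) <- {g1}(x), R(x,y), {g}(y)"])
        elif op == "EUw":
            g1, g2 = go(f[1]), go(f[2])
            b = "B" + g[1:]            # fresh binary predicate paired with g
            shared.add("A")
            rules.extend([
                f"{g}(x) <- {g1}(x), {g2}(x)",
                f"{g}(x) <- {g2}(x), not A(x)",
                f"{g}(x) <- {b}(x,x)",
                f"{g}(x) <- {g2}(x), R(x,y), {g}(y)",
                f"{b}(x,y) <- {g2}(x), R(x,y), {g2}(y)",
                f"{b}(x,y) <- {g2}(x), R(x,u), {b}(u,y)",
            ])
        else:
            raise ValueError(f"unknown operator: {op}")
        return g

    goal = go(phi)
    if "A" in shared:
        rules.append("A(x) <- R(x,y)")
    if "W" in shared:                  # the rules abbreviated by Pi^n
        rules.extend(["W(x) <- R(x,y)", "W(x) <- R(y,x)"])
        rules.extend(f"W(x) <- P{i}(x)" for i in range(n + 1))
    return goal, rules


goal, rules = translate(("not", ("EUw", ("ap", 0), ("ap", 1))), n=1)
```

Each subformula contributes a constant number of rules, which is the intuition behind the linear size bound for the translated program.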

The following example illustrates the translation presented above. 

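Example 4.2 Let us consider a CTL formula $\varphi$ that contains the modality $\widetilde{\mathbf{U}}$, e.g., $\varphi \equiv \neg\mathbf{E}(\psi_1 \widetilde{\mathbf{U}} \psi_2)$. Then

$$h_f(\varphi) = \left\{ \begin{array}{l}
G(x) \leftarrow W(x), \neg G_1(x) \\
G_1(x) \leftarrow G_2(x), G_3(x) \\
G_1(x) \leftarrow G_3(x), \neg A(x) \\
G_1(x) \leftarrow B(x, x) \\
G_1(x) \leftarrow G_3(x), R(x, y), G_1(y) \\
B(x, y) \leftarrow G_3(x), R(x, y), G_3(y) \\
B(x, y) \leftarrow G_3(x), R(x, u), B(u, y) \\
A(x) \leftarrow R(x, y) \\
\Pi_2 \\
\Pi_3
\end{array} \right.$$

where $G_2$, $G_3$, $\Pi_2$ and $\Pi_3$ are the goal predicates and the programs that correspond to subformulae $\psi_1$ and $\psi_2$, respectively. ▲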

The construction of STD programs that correspond to CTL formulae can be performed efficiently. This is formalized by the next proposition; its proof is a direct consequence of Definition 4.2.

Proposition 4.2 Given a CTL formula $\varphi$, the corresponding STD program $\Pi$, which is of size $O(|\varphi|)$, can be constructed in time $O(|\varphi|)$.

4.3 From finite Kripke structures to relational databases 

In this section we show how finite Kripke structures can be seen as relational databases. Definition 4.3 states formally the details of this mapping.

Definition 4.3 Let $AP$ be a finite set $\{p_0, \ldots, p_n\}$ of atomic propositions and assume that $\mathcal{K} = (W, R, V)$ is a finite Kripke structure for $AP$. Then $h_s(\mathcal{K})$ is the database $(R, P_0, \ldots, P_n)$, where $P_i = \{s \in W \mid p_i \in V(s)\}$ contains the states at which $p_i$ is true ($0 \leq i \leq n$).

Further, to $\mathcal{K}$ corresponds the database schema $\mathcal{D}_{\mathcal{K}} = (W, R, P_0, \ldots, P_n)$, with domain the set of states $W$, one binary predicate symbol $R$ and an arbitrary number of unary predicate symbols $P_0, \ldots, P_n$.† A database schema of this form, i.e., containing a single binary predicate symbol and having all other predicate symbols unary, is called a Kripke schema. ⊣
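As an illustration of this definition, the following sketch converts a Kripke structure into a database. The representation is an assumption of the example: states are arbitrary hashable values, `V` is a dictionary from states to sets of proposition names, and the resulting database is a dictionary from relation names (`"R"`, `"P0"`, ...) to sets.

```python
def kripke_to_database(W, R, V, props):
    """Return the database (R, P_0, ..., P_n) of a Kripke structure.

    `props` lists the atomic propositions p_0..p_n in order; P_i holds
    the states s with p_i in V(s).
    """
    db = {"R": set(R)}
    for i, p in enumerate(props):
        db[f"P{i}"] = {s for s in W if p in V[s]}
    return db


W = {0, 1}
R = {(0, 1), (1, 1)}
V = {0: {"a"}, 1: {"a", "b"}}
db = kripke_to_database(W, R, V, ["a", "b"])
```

The construction touches every transition once and every state once per proposition, in line with the linear bound stated for the conversion when the number of propositions is treated as a constant.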

The following proposition is a straightforward consequence of Definition 4.3.

Proposition 4.3 A finite Kripke structure $\mathcal{K}$ can be converted into a relational database $D = h_s(\mathcal{K})$ of size $O(|\mathcal{K}|)$‡ in time $O(|\mathcal{K}|)$.

Notice that the relation $R$ of $h_s(\mathcal{K})$ is total. Moreover, every path $s_0, s_1, s_2, \ldots$ of $\mathcal{K}$ gives rise to the path $s_0, s_1, s_2, \ldots$ in $h_s(\mathcal{K})$ and vice versa: if $s_0, s_1, s_2, \ldots$ is a path in $h_s(\mathcal{K})$, then

$$R(s_i, s_{i+1}), \text{ for every } i \geq 0 \tag{5}$$

The next proposition states formally the basic property of $B(x, x)$.

Proposition 4.4 $B(s, s)$ holds iff there exists a finite sequence of states $s_0, \ldots, s_n$ such that $s_0 = s_n = s$ and $G_2(s_i)$, for every $i$, $0 \leq i \leq n$.
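To make the role of $B$ concrete, the sketch below computes $B$ bottom-up from its two rules and reads off the states $s$ with $B(s, s)$. It assumes, as the rules themselves require, that consecutive states of the sequence are linked by $R$; the function name `compute_B` and the set-based encoding are hypothetical.

```python
def compute_B(R, G2):
    """Fixpoint of  B(x,y) <- G2(x), R(x,y), G2(y)
                    B(x,y) <- G2(x), R(x,u), B(u,y)."""
    B = {(x, y) for (x, y) in R if x in G2 and y in G2}
    changed = True
    while changed:
        changed = False
        for (x, u) in R:
            if x not in G2:
                continue
            for (u2, y) in list(B):            # copy: B grows inside the loop
                if u2 == u and (x, y) not in B:
                    B.add((x, y))
                    changed = True
    return B


R = {(1, 2), (2, 1), (2, 3), (3, 3)}
on_cycle_a = {s for (s, t) in compute_B(R, {1, 2}) if s == t}
on_cycle_b = {s for (s, t) in compute_B(R, {1, 3}) if s == t}
```

With $G_2 = \{1, 2\}$ the cycle $1 \to 2 \to 1$ lies entirely inside $G_2$, so both of its states satisfy $B(s, s)$; with $G_2 = \{1, 3\}$ only the self loop at state 3 qualifies.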

In the proof of the main result in this section (Theorem 4.2) we need the next proposition, which is basically just a simple application of the pigeonhole principle.

Proposition 4.5 Let $\mathcal{K} = (W, R, V)$ be a finite Kripke structure and let $s_0, \ldots, s_i, \ldots, s_j, \ldots, s_n$ be a finite path in $\mathcal{K}$, where $n > |W|$. Then there exist indices $i < j$ and a state $s \in W$ such that $s_i = s_j = s$.



†As we have already pointed out, for simplicity we use the same notation, e.g., $R, P_0, \ldots, P_n$, both for the predicate symbols and the relations. The context makes clear whether $R, P_0, \ldots, P_n$ stand for predicate symbols or relations.
‡The number $n$ of the unary relations $P_0, \ldots, P_n$ is a constant of the problem.

4.4 Embedding CTL into STD

We are now ready to prove the main result of this section, which asserts the correctness of the mapping from CTL formulae to STD programs that we defined earlier.

Theorem 4.2 Let $\mathcal{K}$ be a finite Kripke structure and let $D$ be the corresponding relational database. If $\varphi$ is a CTL formula and $\Pi$ its corresponding STD program (see Definition 4.2), then the following holds:

$$\varphi[\mathcal{K}] = G_\Pi(D) \tag{6}$$

Proof 

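To facilitate the readability of this proof, we use subscripts in the goal predicates to denote the corresponding CTL subformulae. That is, we write $G_{\mathbf{E}\bigcirc\psi}$ to denote that $G$ is the goal predicate of the program corresponding to $\mathbf{E}\bigcirc\psi$. We prove that (6) holds by simultaneous induction on the structure of formula $\varphi$.

1. If $\varphi \equiv p$, where $p \in AP$, or $\varphi \equiv \top$, then the corresponding programs are those of Definition 4.2 (1):

• $\mathcal{K}, s \models p \Leftrightarrow p \in V(s) \Leftrightarrow P(s)$ is a ground fact of $D$ $\Leftrightarrow s \in G_p(D)$.

• ($\Rightarrow$) $\mathcal{K}, s \models \top \Rightarrow s \in W \Rightarrow$ (by the totality of $R$) there exists $t \in W$ such that $(s, t) \in R \Rightarrow s \in W_{\Pi^n}(D) \Rightarrow s \in G_\top(D)$.

($\Leftarrow$) $s \in G_\top(D) \Rightarrow s \in W_{\Pi^n}(D) \Rightarrow s$ appears in one of $R, P_0, \ldots, P_n$ $\Rightarrow s \in W \Rightarrow \mathcal{K}, s \models \top$.

2. If $\varphi \equiv \neg\psi$ or $\varphi \equiv \psi_1 \wedge \psi_2$, then the corresponding programs are shown in Definition 4.2 (2).

$\neg$: ($\Rightarrow$) $\mathcal{K}, s \models \varphi \Rightarrow \mathcal{K}, s \models \neg\psi \Rightarrow \mathcal{K}, s \not\models \psi \Rightarrow$ (by the induction hypothesis) $s \notin G_\psi(D) \Rightarrow s \in G_\varphi(D)$.

($\Leftarrow$) $s \in G_\varphi(D) \Rightarrow s \notin G_\psi(D) \Rightarrow$ (by the induction hypothesis) $\mathcal{K}, s \not\models \psi \Rightarrow \mathcal{K}, s \models \neg\psi \Rightarrow \mathcal{K}, s \models \varphi$.

$\wedge$: ($\Rightarrow$) $\mathcal{K}, s \models \varphi \Rightarrow \mathcal{K}, s \models \psi_1$ and $\mathcal{K}, s \models \psi_2 \Rightarrow$ (by the induction hypothesis) $s \in G_{\psi_1}(D)$ and $s \in G_{\psi_2}(D) \Rightarrow s \in G_{\psi_1}(D) \cap G_{\psi_2}(D) \Rightarrow s \in G_\varphi(D)$.

($\Leftarrow$) $s \in G_\varphi(D) \Rightarrow s \in G_{\psi_1}(D) \cap G_{\psi_2}(D) \Rightarrow s \in G_{\psi_1}(D)$ and $s \in G_{\psi_2}(D) \Rightarrow$ (by the induction hypothesis) $\mathcal{K}, s \models \psi_1$ and $\mathcal{K}, s \models \psi_2 \Rightarrow \mathcal{K}, s \models \varphi$.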

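3. If $\varphi \equiv \mathbf{E}\bigcirc\psi$, then the corresponding program is that of Definition 4.2 (3).

($\Rightarrow$) $\mathcal{K}, s \models \mathbf{E}\bigcirc\psi \Rightarrow$ there exists a path $\pi = s_0, s_1, s_2, \ldots$ with initial state $s_0 = s$, such that $\mathcal{K}, \pi \models \bigcirc\psi \Rightarrow \mathcal{K}, \pi^1 \models \psi$ for the path $\pi^1 = s_1, s_2, \ldots \Rightarrow \mathcal{K}, s_1 \models \psi \Rightarrow$ (by the induction hypothesis) $s_1 \in G_\psi(D)$. Furthermore, from (5) we know that $R(s_0, s_1)$ holds. From the second rule of $\Pi_\varphi$, by combining $G_\psi(s_1)$ with $R(s_0, s_1)$, we derive $G_\varphi(s_0)$ and, thus, $s_0 \in G_\varphi(D)$.

($\Leftarrow$) Let us assume that $s \in G_\varphi(D)$. From the rules of $\Pi_\varphi$† there exists a state $s_1$ such that $R(s, s_1)$ and $G_\psi(s_1)$ hold. By the induction hypothesis we get $\mathcal{K}, s_1 \models \psi$. Let $\pi = s_0, s_1, s_2, \ldots$ be any path with initial state $s_0 = s$ and second state $s_1$. Clearly, then $\mathcal{K}, \pi^1 \models \psi \Rightarrow \mathcal{K}, \pi \models \bigcirc\psi \Rightarrow \mathcal{K}, s \models \varphi$.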

4. If $\varphi \equiv \mathbf{E}(\psi_1 \mathbf{U} \psi_2)$, then the corresponding program is that of Definition 4.2 (3).

($\Rightarrow$) $\mathcal{K}, s \models \mathbf{E}(\psi_1 \mathbf{U} \psi_2) \Rightarrow$ there exists a path $\pi = s_0, s_1, s_2, \ldots$ with initial state $s_0 = s$, such that $\mathcal{K}, \pi^i \models \psi_2$ and $\mathcal{K}, \pi^j \models \psi_1$ ($0 \leq j \leq i - 1$) $\Rightarrow \mathcal{K}, s_i \models \psi_2$ and $\mathcal{K}, s_j \models \psi_1$ ($0 \leq j \leq i - 1$) $\Rightarrow s_i \in G_{\psi_2}(D)$ and $s_j \in G_{\psi_1}(D)$ ($0 \leq j \leq i - 1$) (by the induction hypothesis). From (5) we know that $R(s_r, s_{r+1})$, $0 \leq r < i$. From the first rule of $\Pi_\varphi$: $G_\varphi(x) \leftarrow G_{\psi_2}(x)$ we derive that $G_\varphi(s_i)$. Successive applications of the second rule of $\Pi_\varphi$: $G_\varphi(x) \leftarrow G_{\psi_1}(x), R(x, y), G_\varphi(y)$ yield $G_\varphi(s_{i-1}), G_\varphi(s_{i-2}), \ldots, G_\varphi(s_1), G_\varphi(s_0)$. Thus, $s_0 \in G_\varphi(D)$.

($\Leftarrow$) For the inverse direction, suppose that $s \in G_\varphi(D)$. The rules of $\Pi_\varphi$ imply the existence of a state $s_i$ (possibly $s_i = s$) such that $G_{\psi_2}(s_i)$. In addition, there exists a sequence of states $s_0 = s, s_1, \ldots, s_i$ such that $R(s_r, s_{r+1})$ and $G_{\psi_1}(s_r)$ ($0 \leq r < i$). By the induction hypothesis we get that $\mathcal{K}, s_i \models \psi_2$ and $\mathcal{K}, s_j \models \psi_1$ ($0 \leq j \leq i - 1$) (because $\psi_1$ and $\psi_2$ are state formulae). Let $\pi = s_0, s_1, s_2, \ldots, s_i, \ldots$ be any path with initial segment $s_0, s_1, \ldots, s_i$. Then, $\mathcal{K}, \pi^i \models \psi_2$ and $\mathcal{K}, \pi^j \models \psi_1$ ($0 \leq j \leq i - 1$), i.e., $\mathcal{K}, \pi \models \varphi$.

†Recall that in this case relation $R$ is total. Hence, $A(s)$ holds for every state $s$ and the first rule does not add new states to $G_\varphi(D)$.

5. If $\varphi \equiv \mathbf{E}(\psi_1 \widetilde{\mathbf{U}} \psi_2)$, then the corresponding program is that of Definition 4.2 (3).

($\Rightarrow$) Recall from Section 2 that $\mathcal{K}, s \models \mathbf{E}(\psi_1 \widetilde{\mathbf{U}} \psi_2)$ means that there exists a path $\pi = s_0, s_1, s_2, \ldots$ with initial state $s_0 = s$, such that either (1) $\mathcal{K}, \pi^i \models \psi_2$, for every $i \geq 0$, or (2) $\mathcal{K}, \pi^i \models \psi_1 \wedge \psi_2$ and $\mathcal{K}, \pi^j \models \psi_2$, $0 \leq j \leq i - 1$. We examine both cases:

(a) In the first case $\mathcal{K}, s_i \models \psi_2$, for every $i \geq 0$. The induction hypothesis gives that $s_i \in G_{\psi_2}(D)$, for every $i \geq 0$. Let $s_0, s_1, s_2, \ldots, s_n$ be an initial segment of $\pi$, with $n > |W|$. From Proposition 4.5 we know that in the aforementioned sequence there exists a state $t$ such that $t = s_k = s_l$, $0 \leq k < l \leq n$. Then Proposition 4.4 implies that $(s_k, s_k) \in B(D)$. From the third rule of $\Pi_\varphi$: $G_\varphi(x) \leftarrow B(x, x)$, we derive that $G_\varphi(s_k)$. Successive applications of the fourth rule of $\Pi_\varphi$: $G_\varphi(x) \leftarrow G_{\psi_2}(x), R(x, y), G_\varphi(y)$ yield $G_\varphi(s_{k-1}), G_\varphi(s_{k-2}), \ldots, G_\varphi(s_1), G_\varphi(s_0)$. Accordingly, $s_0 \in G_\varphi(D)$.

(b) In the second case, $\mathcal{K}, s_i \models \psi_1 \wedge \psi_2$ and $\mathcal{K}, s_j \models \psi_2$, $0 \leq j \leq i - 1$. By the induction hypothesis $s_i \in G_{\psi_1}(D)$ and $s_j \in G_{\psi_2}(D)$, $0 \leq j \leq i$. From the first rule of $\Pi_\varphi$: $G_\varphi(x) \leftarrow G_{\psi_1}(x), G_{\psi_2}(x)$ we derive that $G_\varphi(s_i)$. Successive applications of the fourth rule of $\Pi_\varphi$: $G_\varphi(x) \leftarrow G_{\psi_2}(x), R(x, y), G_\varphi(y)$ yield $G_\varphi(s_{i-1}), G_\varphi(s_{i-2}), \ldots, G_\varphi(s_1), G_\varphi(s_0)$. Therefore, $s_0 \in G_\varphi(D)$.

($\Leftarrow$) For the inverse direction, suppose that $s_0 \in G_\varphi(D)$. We define $G_\varphi(D, n)$ to be the set of ground facts of the IDB $G_\varphi$ that have been computed during the first $n$ rounds of the evaluation of the last stratum of the program $\Pi_\varphi$. We shall prove that for every $t \in G_\varphi(D, n)$, there exists a path $\pi = t_0, t_1, t_2, \ldots$ with initial state $t_0 = t$, such that $\mathcal{K}, \pi \models \varphi$. We use induction on the number of rounds $n$.

(a) If $n = 1$, then $t$ must appear either due to the first rule of $\Pi_\varphi$: $G_\varphi(x) \leftarrow G_{\psi_1}(x), G_{\psi_2}(x)$ or due to the third rule of $\Pi_\varphi$: $G_\varphi(x) \leftarrow B(x, x)$†, assuming $B$ is in a previous stratum. Note that if $B$ is in the last stratum, then, of course, $t$ could not have appeared due to the third rule. In the former case $t \in G_{\psi_1}(D) \cap G_{\psi_2}(D)$; the induction hypothesis for $\psi_1$ and $\psi_2$ means that $\mathcal{K}, t \models \psi_1 \wedge \psi_2$, which immediately implies that $\mathcal{K}, \pi \models \varphi$ for any path $\pi = t_0, t_1, t_2, \ldots$ with initial state $t_0 = t$. In the latter case, $(t, t) \in B_\varphi(D)$ and, in view of Proposition 4.4, this implies the existence of a finite sequence $t_0, t_1, \ldots, t_k$, such that $t_0 = t_k = t$ and $\mathcal{K}, t_j \models \psi_2$, $0 \leq j \leq k$. Consider the path $\pi = (t_0, t_1, \ldots, t_k)^\omega$; for this path $\mathcal{K}, \pi \models \varphi$.

(b) We show now that the claim holds for $n + 1$, assuming that it holds for $n$. Suppose that $t$ first appeared in $G_\varphi(D, n + 1)$ during round $n + 1$. This could have happened either because of the third rule: $G_\varphi(x) \leftarrow B(x, x)$ or because of the fourth rule: $G_\varphi(x) \leftarrow G_{\psi_2}(x), R(x, y), G_\varphi(y)$.

In the first case $(t, t) \in B_\varphi(D)$. Then Proposition 4.4 asserts the existence of a finite sequence $t_0, t_1, \ldots, t_k$ of states, such that $t_0 = t_k = t$ and $\mathcal{K}, t_j \models \psi_2$, $0 \leq j \leq k$. Consider the path $\pi = (t_0, t_1, \ldots, t_k)^\omega$; for this path $\mathcal{K}, \pi \models \varphi$.

In the second case, we know that $G_{\psi_2}(t)$ and that there exists a $t_1$ such that $R(t, t_1)$ and $G_\varphi(t_1)$. By the induction hypothesis, we get that $\mathcal{K}, t \models \psi_2$ and that $\mathcal{K}, t_1 \models \varphi$. Immediately then we conclude that $\mathcal{K}, \pi \models \varphi$ for the path $\pi = t_0, t_1, t_2, \ldots$ with $t_0 = t$. ⊣

5 Embedding a fragment of Stratified Datalog into CTL 

In the previous section we defined a mapping from CTL to the class of STD programs. In this section we work in the opposite direction, that is, we define an embedding from STD to CTL. We start by explaining the technical challenges of this embedding.

5.1 Technical Challenges 

In Kripke structures the accessibility relation $R$ is total and, as a result, the corresponding relational database contains a total binary relation $R$. Here lies the main problem when going from databases to Kripke structures: a database relation is not necessarily total. To overcome this problem we define the total closure $R^*$ of an arbitrary binary relation $R$ with respect to a domain $W$ as follows:

$$R^* = R \cup \{(x, x) \mid x \in W \text{ and } \nexists y \text{ such that } R(x, y)\} \tag{7}$$

†Relation $R$ is total, meaning that $t \in A_\varphi(D)$, and, thus, $t$ could not have appeared from an application of the second rule.

In simple words, the above equation means that even when $R$ is not total, we can still get a total relation by adding a self loop to the states that have no successors. Note that if $R$ is already total then $R^* = R$.
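The total closure is a one-line set computation. The following sketch assumes that a relation is a Python set of pairs and a domain is a set of constants; the function name `total_closure` is chosen for the example.

```python
def total_closure(R, W):
    """Add a self loop (x, x) for every x in W that has no successor in R."""
    has_successor = {x for (x, _) in R}
    return set(R) | {(x, x) for x in W if x not in has_successor}


closed = total_closure({(1, 2)}, {1, 2, 3})
unchanged = total_closure({(1, 1)}, {1})
```

State 1 already has a successor and is left alone, while states 2 and 3 each receive a self loop; a relation that is already total is returned unchanged.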

5.2 From STD programs to CTL formulae 

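We define a mapping $f = (f_q, f_d)$ such that:

1. $f_q$ maps STD programs into CTL formulae, that is, given a program $\Pi$ with unary goal predicate $G$, $f_q(\Pi)$ is a CTL formula $\varphi$.

2. $f_d$ maps relational databases to Kripke structures, i.e., $f_d(D)$ is a Kripke structure $\mathcal{K}$.

3. For this mapping it holds:

$$G_\Pi(D) = \varphi[\mathcal{K}], \quad \text{where } \varphi = f_q(\Pi) \text{ and } \mathcal{K} = f_d(D)$$

The correspondence of STD programs to CTL formulae is given below (subformula $\psi_i$ corresponds to subprogram $\Pi_i$, $i = 1, 2$).

Definition 5.1 Given a $STD_n$ program $\Pi$, $f_q(\Pi)$ is the CTL formula defined recursively as follows:

1. If $\Pi$ is

$$\left\{ G(x) \leftarrow P_i(x) \right. \qquad \text{or} \qquad \left\{ \begin{array}{l} G(x) \leftarrow W(x) \\ \Pi^n \end{array} \right.$$

then $f_q(\Pi)$ is $p_i$ and $\top$, respectively.

2. If $\Pi = \overline{[\Pi_1]}$ or $\Pi = \bigwedge[\Pi_1, \Pi_2]$, then $f_q(\Pi)$ is $\neg\psi_1$ and $\psi_1 \wedge \psi_2$, respectively.

3. If $\Pi = \mathbf{X}[\Pi_1]$ or $\Pi = \mathbf{U}[\Pi_1, \Pi_2]$ or $\Pi = \widetilde{\mathbf{U}}[\Pi_1, \Pi_2]$, then $f_q(\Pi)$ is $\mathbf{E}\bigcirc\psi_1$, $\mathbf{E}(\psi_1 \mathbf{U} \psi_2)$ and $\mathbf{E}(\psi_1 \widetilde{\mathbf{U}} \psi_2)$, respectively. ⊣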

The following proposition asserts that the construction of CTL formulae that correspond to STD programs can be performed efficiently. Its proof is an immediate consequence of Definition 5.1.

Proposition 5.1 Given a STD program $\Pi$, the corresponding CTL formula $\varphi$, which is of size $O(|\Pi|)$, can be constructed in time $O(|\Pi|)$.

5.3 From databases to finite Kripke structures 

In this section we show how an arbitrary relational database can be transformed into a finite Kripke structure in a meaningful way. Definition 5.2 has the details of this transformation.

Definition 5.2 Let $D$ be a database over the Kripke schema $\mathcal{D}_{\mathcal{K}} = (U, R, P_0, \ldots, P_n)$. We define the domain $W$ of $D$ as follows:

$$W = \{x \in U \mid R(x, y)\} \cup \{x \in U \mid R(y, x)\} \cup \bigcup_{i=0}^{n} \{x \in U \mid P_i(x)\} \tag{8}$$

Let $D^*$ be the total database $(R^*, P_0, \ldots, P_n)$, where $R^*$ is the total closure of $R$ with respect to $W$; then $f_d(D)$ is the finite Kripke structure $(W, R^*, V)$ for $AP = \{p_0, \ldots, p_n\}$, with $V(s) = \{p_i \in AP \mid P_i(s)\}$. ⊣
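The transformation of Definition 5.2 can be sketched as follows. The dictionary layout of the database (`"R"` for the binary relation, `"P0"`, ..., `"Pn"` for the unary ones) and the proposition names `p0`, ..., `pn` are assumptions made for this illustration only.

```python
def database_to_kripke(db, n):
    """Build (W, R_star, V) from a database over a Kripke schema."""
    R = set(db.get("R", set()))
    # Domain as in equation (8): constants occurring in R or in some P_i.
    W = {x for (x, _) in R} | {y for (_, y) in R}
    for i in range(n + 1):
        W |= set(db.get(f"P{i}", set()))
    # Total closure: a self loop for every state without a successor.
    has_successor = {x for (x, _) in R}
    R_star = R | {(x, x) for x in W if x not in has_successor}
    # Valuation: V(s) collects the propositions p_i with P_i(s) in the database.
    V = {s: {f"p{i}" for i in range(n + 1) if s in db.get(f"P{i}", set())}
         for s in W}
    return W, R_star, V


db = {"R": {("a", "b")}, "P0": {"b"}, "P1": {"c"}}
W, R_star, V = database_to_kripke(db, n=1)
```

In the example, `c` enters the domain only through `P1`, and both `b` and `c` receive self loops because neither has an outgoing `R` fact.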

$f_d(D)$ is well-defined because $R^*$ is total, as required by Definition 2.1. The next proposition follows directly from Definition 5.2.

Proposition 5.2 Let $D$ be a relational database $(R, P_0, \ldots, P_n)$ over a Kripke schema and let $W$ be the domain of $D$ as defined by (8). $D$ can be transformed into a finite Kripke structure $\mathcal{K} = f_d(D)$ of size $O(|W| + |R|) = O(|D|)$ in time $O(|D|)$.†

†Recall that the number $n$ of the unary relations $P_0, \ldots, P_n$ is a constant of the problem.

The main result of this section is that the mapping $f = (f_q, f_d)$ is such that the following holds: $G_\Pi(D) = \varphi[\mathcal{K}]$, where $\varphi = f_q(\Pi)$ and $\mathcal{K} = f_d(D)$. Before proving that, we show that STD programs cannot distinguish between a database $D$ and the corresponding total database $D^*$, i.e., they are invariant under total closure.

Theorem 5.1 If $\Pi$ is a STD program with goal predicate $G$ and $D$ a database with a Kripke schema, then

$$G_\Pi(D) = G_\Pi(D^*) \tag{9}$$
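As a small sanity check of this invariance for a single operator, the sketch below evaluates the three rules of the $\mathbf{X}$ operator on a non-total relation and on its total closure, for every choice of the subgoal relation $G_1$ over a two-element domain. It is an illustration on one hand-picked database, not a proof; the helper names `total_closure` and `eval_X` are hypothetical.

```python
def total_closure(R, W):
    """Add a self loop for every element of W without a successor in R."""
    has_successor = {x for (x, _) in R}
    return set(R) | {(x, x) for x in W if x not in has_successor}


def eval_X(R, G1):
    """Evaluate  A(x) <- R(x,y);  G(x) <- G1(x), not A(x);  G(x) <- R(x,y), G1(y)."""
    A = {x for (x, _) in R}                        # lower stratum
    first = {x for x in G1 if x not in A}          # states without successors
    second = {x for (x, y) in R if y in G1}        # states with a successor in G1
    return first | second


R, W = {(1, 2)}, {1, 2}
R_star = total_closure(R, W)
results = {frozenset(G1): (eval_X(R, G1), eval_X(R_star, G1))
           for G1 in [set(), {1}, {2}, {1, 2}]}
```

On the original relation, state 2 enters the goal through the first rule (it has no successor); on the closed relation the same state enters through the second rule via its self loop, so the two answers coincide.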

Proof 

We prove that (9) holds by induction on the structure of the program $\Pi$.

1. If $\Pi = \left\{ G(x) \leftarrow P_i(x) \right.$, then $s \in G_\Pi(D) \Leftrightarrow P_i(s)$ is a ground fact of $D$ $\Leftrightarrow P_i(s)$ is a ground fact of $D^*$ $\Leftrightarrow s \in G_\Pi(D^*)$.

2. If

$$\Pi = \left\{ \begin{array}{l} G(x) \leftarrow W(x) \\ \Pi^n \end{array} \right.$$

then:

($\Rightarrow$) $s \in W_{\Pi^n}(D) \Rightarrow D$ contains a ground fact of the form $P_i(s)$ or $R(s, t)$ or $R(t, s)$. Obviously, $D^*$ also contains this ground fact, which means that $s \in W_{\Pi^n}(D^*)$.

($\Leftarrow$) $s \in W_{\Pi^n}(D^*) \Rightarrow D^*$ contains a ground fact of the form $P_i(s)$ or $R(s, t)$ or $R(t, s)$, or $R(s, s)$. In the first three cases $D$ also contains this ground fact; however, $D$ may not contain a ground fact of $D^*$ that has the form $R(s, s)$. If $D$ does not contain $R(s, s)$, this implies (recall the definition of $R^*$) that $D$ contains a fact $P_i(s)$ or a fact $R(t, s)$ for some constant $t$, but does not contain any fact of the form $R(s, u)$. But then we would have that $s \in W_{\Pi^n}(D)$ due to $P_i(s)$ or $R(t, s)$.

3. If $\Pi = \overline{[\Pi_1]}$, then $s \in G_\Pi(D) \Leftrightarrow s \in W_{\Pi^n}(D)$ and $s \notin {G_1}_{\Pi_1}(D)$. Reasoning as above we conclude that $s \in W_{\Pi^n}(D) \Leftrightarrow s \in W_{\Pi^n}(D^*)$. Furthermore, by the induction hypothesis with respect to $\Pi_1$, we get that $s \in {G_1}_{\Pi_1}(D) \Leftrightarrow s \in {G_1}_{\Pi_1}(D^*)$.

4. If $\Pi = \bigwedge[\Pi_1, \Pi_2]$, then $s \in G_\Pi(D) \Leftrightarrow s \in {G_1}_{\Pi_1}(D)$ and $s \in {G_2}_{\Pi_2}(D) \Leftrightarrow$ (by the induction hypothesis) $s \in {G_1}_{\Pi_1}(D^*)$ and $s \in {G_2}_{\Pi_2}(D^*) \Leftrightarrow s \in G_\Pi(D^*)$.

5. If $\Pi = \mathbf{X}[\Pi_1]$, then:

($\Rightarrow$) Suppose that $s \in G_\Pi(D)$; this is a result of either the first or the second rule of $\Pi$. If it is due to the first rule, then $s \in {G_1}_{\Pi_1}(D)$ and $D$ does not contain a ground fact of the form $R(s, u)$, for any constant $u$. If it is due to the second rule, $D$ contains a ground fact $R(s, u)$, for some constant $u$, and $u \in {G_1}_{\Pi_1}(D)$.

In the former case, the induction hypothesis implies that $s \in {G_1}_{\Pi_1}(D^*)$. Moreover, by construction $D^*$ contains the ground fact $R(s, s)$. Hence, $s \in G_\Pi(D^*)$ because of the second rule of $\Pi$. In the latter case, the induction hypothesis implies that $u \in {G_1}_{\Pi_1}(D^*)$. Taking into account that $D^*$ contains $R(s, u)$, we conclude that $s \in G_\Pi(D^*)$ because of the second rule of $\Pi$.

($\Leftarrow$) Suppose that $s \in G_\Pi(D^*)$. Let us assume for a moment that $s$ appears in $G_\Pi(D^*)$ due to an application of the first rule of $\Pi$. This would imply that $s \notin A_\Pi(D^*)$. But this is absurd because $R^*$ is total by construction (i.e., $\forall s \exists u\, R^*(s, u)$), meaning that $s \in A_\Pi(D^*)$. This shows that when evaluating $\Pi$ on "total" databases, such as $D^*$, the first rule of $\Pi$ is redundant. Hence, $s$ must appear in $G_\Pi(D^*)$ as a result of an application of the second rule of $\Pi$. This means that $D^*$ contains a ground fact of the form $R(s, u)$, for some constant $u$ (possibly $s = u$), and $u \in {G_1}_{\Pi_1}(D^*)$. The induction hypothesis gives that $u \in {G_1}_{\Pi_1}(D)$ (*). If $D$ contains the ground fact $R(s, u)$, then $s \in G_\Pi(D)$ due to the second rule of $\Pi$. If however $D$ does not contain the ground fact $R(s, u)$, then by the definition of $R^*$ we deduce that: (a) $D$ contains no ground fact of the form $R(s, v)$, meaning that $s \notin A_\Pi(D)$ (**) and (b) the ground fact in $D^*$ is actually $R(s, s)$, i.e., $s = u$, which, in view of (*), means that $s \in {G_1}_{\Pi_1}(D)$ (***). By (**) and (***) we conclude that $s \in G_\Pi(D)$ due to the first rule of $\Pi$.

6. If $\Pi = \mathbf{U}[\Pi_1, \Pi_2]$, then:

($\Rightarrow$) Suppose that $s \in G_\Pi(D)$; from the rules of the program $\Pi$ we see that there is a state $s_i$ (possibly $s_i = s$) such that $s_i \in {G_2}_{\Pi_2}(D)$. In addition, there exists a sequence $s_0 = s, s_1, \ldots, s_i$ such that $D$ contains the ground facts $R(s_r, s_{r+1})$ and $s_r \in {G_1}_{\Pi_1}(D)$ ($0 \leq r < i$). By construction $D^*$ also contains the ground facts $R(s_r, s_{r+1})$ ($0 \leq r < i$). Further, the induction hypothesis implies that $s_i \in {G_2}_{\Pi_2}(D^*)$ and $s_r \in {G_1}_{\Pi_1}(D^*)$ ($0 \leq r < i$). Consequently, by successive applications of the second rule, we conclude that $s \in G_\Pi(D^*)$.

($\Leftarrow$) Suppose that $s \in G_\Pi(D^*)$. Consider a minimal sequence $s_0 = s, s_1, \ldots, s_i$ (possibly $s_i = s$) such that $D^*$ contains the ground facts $R(s_r, s_{r+1})$, $s_r \in {G_1}_{\Pi_1}(D^*)$ and $s_r \notin {G_2}_{\Pi_2}(D^*)$ ($0 \leq r < i$) and $s_i \in {G_2}_{\Pi_2}(D^*)$. Database $D$ also contains the facts $R(s_r, s_{r+1})$ ($0 \leq r < i$). For suppose to the contrary that $D$ does not contain $R(s_k, s_{k+1})$, for some $k$, $0 \leq k < i$. This means that $s_k = s_{k+1} = \ldots = s_i$ (recall (7)), which in turn implies that $s_i \notin {G_2}_{\Pi_2}(D^*)$, i.e., a contradiction. Thus, we have established that $D$ also contains the facts $R(s_r, s_{r+1})$ ($0 \leq r < i$). Now, the induction hypothesis implies that $s_i \in {G_2}_{\Pi_2}(D)$ and $s_r \in {G_1}_{\Pi_1}(D)$ ($0 \leq r < i$). Consequently, by successive applications of the second rule, we conclude that $s \in G_\Pi(D)$.

7. If n = U[ni,n2], then let Gu{D,n) and Gn{D*,n) be the sets of ground facts of G that have been 
computed during the first n rounds of the evaluation of the last stratum of 11 on Z? and D* , respectively. 
We shall prove that s G Gn{D, n) -i^ s E Gn{D*, n) using induction on the number of rounds n. 

• ($\Rightarrow$) Let $s \in G_\Pi(D, 1)$; $s$ appears due to one of the first three rules of $\Pi$. If it is due to the first
rule, $G(x) \leftarrow G_1(x), G_2(x)$, then $s \in G_{1\Pi_1}(D) \cap G_{2\Pi_2}(D)$ and the induction hypothesis pertaining
to $\Pi_1$ and $\Pi_2$ gives that $s \in G_{1\Pi_1}(D^*) \cap G_{2\Pi_2}(D^*)$, which immediately implies that $s \in G_\Pi(D^*, 1)$.
If it is due to the second rule, $G(x) \leftarrow G_2(x), \neg A(x)$, then $s \in G_{2\Pi_2}(D)$ and $D$ does not contain
a ground fact of the form $R(s, u)$, for any constant $u$. The induction hypothesis with respect to $\Pi_2$
implies that $s \in G_{2\Pi_2}(D^*)$. Moreover, by construction $D^*$ contains the ground fact $R(s, s)$. Then,
by the fifth rule of $\Pi$, $B(x, y) \leftarrow G_2(x), R(x, y), G_2(y)$, we get $(s, s) \in B_\Pi(D^*)$ and, consequently, by the
third rule $s \in G_\Pi(D^*, 1)$. If it is due to the third rule, $G(x) \leftarrow B(x, x)$, then $(s, s) \in B_\Pi(D)$.
This means that $D$ contains a sequence of ground facts $R(s_0, s_1), R(s_1, s_2), \ldots, R(s_k, s_{k+1})$ with
$s_r \in G_{2\Pi_2}(D)$, $0 \leq r \leq k+1$, and $s_0 = s_{k+1} = s$. Using the induction hypothesis pertaining to $\Pi_2$
we obtain $s_r \in G_{2\Pi_2}(D^*)$, $0 \leq r \leq k+1$. Further, by construction $D^*$ contains all the facts of $D$ and,
therefore, $(s, s) \in B_\Pi(D^*)$. Finally, by the third rule we conclude that $s \in G_\Pi(D^*, 1)$.

($\Leftarrow$) Let $s \in G_\Pi(D^*, 1)$; $s$ appears either due to the first or due to the third rule of $\Pi$. The totality
of $D^*$ precludes the use of the second rule. If it is due to the first rule, a trivial invocation of the
induction hypothesis pertaining to $\Pi_1$ and $\Pi_2$ gives that $s \in G_\Pi(D, 1)$. If it is due to the third rule,
then $(s, s) \in B_\Pi(D^*)$ and $s \in G_{2\Pi_2}(D^*)$. We distinguish two cases, depending on whether $D^*$ contains
the ground fact $R(s, s)$ or not. Let us first consider the case where $R(s, s)$ is in $D^*$. If $R(s, s)$ is also
in $D$, then, of course, $(s, s) \in B_\Pi(D)$ and, consequently, $s \in G_\Pi(D, 1)$. So, let us assume that $D$ does
not contain $R(s, s)$. This means that $D$ contains no ground fact of the form $R(s, u)$, for any $u$, or, in
other words, that $s \notin A_\Pi(D)$. Then, if we apply the second rule of $\Pi$, using the induction hypothesis
to derive that $s \in G_{2\Pi_2}(D)$, we conclude that $s \in G_\Pi(D, 1)$. Let us now consider the case where
$R(s, s)$ is not in $D^*$. This means that $D^*$ contains a sequence of ground facts $R(s_0, s_1), R(s_1, s_2), \ldots,$
$R(s_k, s_{k+1})$ with $s_r \in G_{2\Pi_2}(D^*)$, $0 \leq r \leq k+1$, and $s_0 = s_{k+1} = s$. Without loss of generality we
may assume that this sequence does not contain any fact of the form $R(u, u)$ (see the footnote). $D$ also contains the
facts $R(s_0, s_1), R(s_1, s_2), \ldots, R(s_k, s_{k+1})$; for suppose to the contrary that one of these facts is not
present in $D$. But then this "missing" fact must be of the form $R(u, u)$, which is absurd. Hence, using
the induction hypothesis to derive that $s_r \in G_{2\Pi_2}(D)$, $0 \leq r \leq k+1$, we deduce that $(s, s) \in B_\Pi(D)$,
and, consequently, that $s \in G_\Pi(D, 1)$.

Footnote: To see why, let us suppose that it contains the fact $R(u, u)$. This means that the aforementioned sequence is $R(s_0, s_1), R(s_1, s_2), \ldots,$
$R(s_l, u), R(u, u), R(u, s_{l+3}), R(s_{l+3}, s_{l+4}), \ldots, R(s_k, s_{k+1})$. But then simply consider the sequence $R(s_0, s_1), R(s_1, s_2), \ldots,$
$R(s_l, u), R(u, s_{l+3}), R(s_{l+3}, s_{l+4}), \ldots, R(s_k, s_{k+1})$, which also gives rise to $(s, s) \in B_\Pi(D^*)$ without containing $R(u, u)$.

• We show now that the claim holds for $n + 1$, assuming that it holds for $n$.

($\Rightarrow$) Suppose that $s$ first appeared in $G_\Pi(D, n+1)$ during round $n+1$. This could have happened due to
one of the first four rules of $\Pi$. In case one of the first three rules is used, then by reasoning as above, we
conclude that $s \in G_\Pi(D^*, n+1)$. So, let us suppose that the fourth rule, $G(x) \leftarrow G_2(x), R(x, y), G(y)$,
is used. This implies that $s \in G_{2\Pi_2}(D)$ and that there exists an $s_1$ such that $R(s, s_1)$ and $s_1 \in G_\Pi(D, n)$.
By construction $D^*$ also contains $R(s, s_1)$. Moreover, the induction hypothesis with respect to the
number of rounds gives that $s_1 \in G_\Pi(D^*, n)$ and the induction hypothesis with respect to $\Pi_2$ gives
that $s \in G_{2\Pi_2}(D^*)$. Thus, by the fourth rule we derive that $s \in G_\Pi(D^*, n+1)$.

($\Leftarrow$) Suppose now that $s$ first appeared in $G_\Pi(D^*, n+1)$ during round $n+1$. This could have
happened due to one of the first four rules of $\Pi$. In case one of the first three rules is used, then
by reasoning as before, we obtain that $s \in G_\Pi(D, n+1)$. So, let us suppose that the fourth rule,
$G(x) \leftarrow G_2(x), R(x, y), G(y)$, is used. This implies that $s \in G_{2\Pi_2}(D^*)$ and that there exists an $s_1$ such
that $R(s, s_1)$ and $s_1 \in G_\Pi(D^*, n)$. The fact that $s$ first appeared in $G_\Pi(D^*, n+1)$ during round $n+1$
means that $s \neq s_1$, because if $s = s_1$, then $s$ would belong to $G_\Pi(D^*, n)$. This in turn implies that
$D$ contains $R(s, s_1)$. Invoking the induction hypothesis we get that $s_1 \in G_\Pi(D, n)$ and $s \in G_{2\Pi_2}(D)$.
Thus, by the fourth rule we derive that $s \in G_\Pi(D, n+1)$.

The bottom-up evaluation of Datalog programs guarantees that there exists $n_0 \in \mathbb{N}$ such that
$G_\Pi(D, n_0) = G_\Pi(D, r)$ for every $r > n_0$, meaning that $G_\Pi(D) = G_\Pi(D, n_0)$. Similarly, $G_\Pi(D^*) =
G_\Pi(D^*, n_0)$ and, hence, $G_\Pi(D) = G_\Pi(D^*)$. ⊣
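
The argument above uses only two properties of $D^*$: it contains every fact of $D$, and it contains a self-loop $R(s, s)$ exactly for those states $s$ that have no outgoing $R$-fact in $D$. A minimal Python sketch of that construction follows, assuming the accessibility relation is stored as a set of pairs; the function name `totalize` and the sample data are illustrative and do not come from the paper.

```python
def totalize(states, R):
    """Return R* = R plus a self-loop on every state with no outgoing edge."""
    has_successor = {s for (s, _) in R}
    return set(R) | {(s, s) for s in states if s not in has_successor}


states = {"a", "b", "c"}
R = {("a", "b"), ("b", "c")}      # "c" has no successor in D
R_star = totalize(states, R)      # R* is total: every state has a successor
```

Applying `totalize` to an already total relation returns it unchanged, which mirrors the observation that the only facts of $D^*$ missing from $D$ are of the form $R(u, u)$.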

5.4 Embedding STD to CTL 

We now complete the proof that CTL has exactly the same expressive power as STD programs. The following
result complements that of Section 4 and it proves that there exists an embedding of STD to CTL.

Theorem 5.2 Let $D$ be a relational database over a Kripke schema and let $\mathcal{K}$ be the corresponding finite Kripke
structure. If $\Pi$ is a STD program and $\varphi$ its corresponding CTL formula (see Definition 5.1), then the following
holds:

$G_\Pi(D) = \varphi[\mathcal{K}]$ (10)

Proof

From Theorem 5.1 we know that $G_\Pi(D) = G_\Pi(D^*)$. Further, we can show that $G_\Pi(D^*) = \varphi[\mathcal{K}]$; the proof is
identical to the proof of Theorem 4.2 and is omitted. This completes the proof. ⊣

6 Stratified Datalog: an efficient fragment 

In Sections 4 and 5 we established the equivalence of CTL with STD. In this section we capitalize on this
relation by showing that STD is an efficient fragment of stratified Datalog in the sense that: (a) satisfiability
and containment are decidable and (b) query evaluation is linear. The only other fragment of stratified Datalog
known to have "good" properties is presented in [LMSS93] and [HMSS01], where it is shown that satisfiability
and equivalence are decidable for Datalog programs with stratified negation and unary EDB predicates.

6.1 Query Evaluation 

Definitions 4.2 and 5.1 in essence provide algorithms for constructing a STD program which corresponds to a
CTL formula and vice versa. Notice that this translation can be carried out efficiently in both directions. This
is formalized by Propositions 4.2 and 5.1 which, together with Propositions 4.3 and 5.2, suggest an efficient
method for performing program evaluation in this fragment. Suppose we are given a database $D$ (with a Kripke
schema) and a STD program $\Pi$ with goal $G$, and we want to evaluate $G$ on $D$, i.e., to compute $G_\Pi(D)$. This can be
done as follows:

1. From $\Pi$ and $D$ construct the corresponding $\varphi$ and $\mathcal{K}$ respectively. This step requires $O(|\Pi| + |D|)$ time and
results in a formula $\varphi$ of size $O(|\Pi|)$ and a Kripke structure $\mathcal{K}$ of size $O(|D|)$.

2. Apply a model checking algorithm for $\mathcal{K}$ and $\varphi$. The algorithm will compute the truth set $\varphi[\mathcal{K}]$, i.e., the set
of states of $\mathcal{K}$ on which $\varphi$ is true. According to Theorem 5.2, $\varphi[\mathcal{K}]$ is exactly the outcome of the evaluation
of $G$ on $D$.

Taking into account that the model checking algorithms for CTL run in $O(|\mathcal{K}||\varphi|)$ time (see [VW86]), we
derive the following theorem.

Theorem 6.1 Given a STD program $\Pi$ with goal $G$ and a database $D$, evaluating $G$ on $D$ can be done in
$O(|D||\Pi|)$ time. ⊣

The above result establishes the existence of fragments of stratified Datalog where the problem of query 
evaluation has linear program and data complexity. 
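
Step 2 of the method can be illustrated with a small truth-set computation. The following Python sketch is a naive fixed-point labelling procedure for CTL formulae built from atoms, negation, conjunction, EX, EU and EG; it is not the linear-time algorithm that the complexity bound relies on, and the tuple encoding of formulae, the function name `sat` and the sample structure are assumptions made only for this illustration.

```python
def sat(formula, states, R, label):
    """Truth set of a CTL formula over the Kripke structure (states, R, label).

    Formulas are nested tuples: ("true",), ("atom", p), ("not", f),
    ("and", f, g), ("EX", f), ("EU", f, g), ("EG", f).
    """
    op = formula[0]
    if op == "true":
        return set(states)
    if op == "atom":
        return {s for s in states if formula[1] in label[s]}
    if op == "not":
        return set(states) - sat(formula[1], states, R, label)
    if op == "and":
        return sat(formula[1], states, R, label) & sat(formula[2], states, R, label)
    if op == "EX":
        target = sat(formula[1], states, R, label)
        return {s for (s, t) in R if t in target}
    if op == "EU":
        left = sat(formula[1], states, R, label)
        result = sat(formula[2], states, R, label)      # least fixed point
        while True:
            new = {s for (s, t) in R if s in left and t in result} - result
            if not new:
                return result
            result |= new
    if op == "EG":
        result = sat(formula[1], states, R, label)      # greatest fixed point
        while True:
            keep = {s for s in result if any((s, t) in R for t in result)}
            if keep == result:
                return result
            result = keep
    raise ValueError(f"unknown operator: {op}")


# A three-state structure with a total accessibility relation.
states = {0, 1, 2}
R = {(0, 1), (1, 2), (2, 2)}
label = {0: {"p"}, 1: {"p"}, 2: {"q"}}

eu = sat(("EU", ("atom", "p"), ("atom", "q")), states, R, label)
eg = sat(("EG", ("atom", "p")), states, R, label)
ex = sat(("EX", ("atom", "q")), states, R, label)
```

Under Theorem 5.2, the set returned for a formula $\varphi$ would coincide with $G_\Pi(D)$ for the corresponding STD program and database.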



6.2 Satisfiability 

In the following paragraphs we show that the problem of checking the satisfiability of a STD program is reduced
to that of checking the satisfiability of a CTL formula. We start with the following corollary of Theorem 2.1,
on which we build later to argue about the satisfiability of STD programs.

Corollary 6.1 The satisfiability problem for CTL is EXPTIME-complete.

Definition 6.1 (Satisfiability for Datalog programs) An IDB predicate $G$ of program $\Pi$ is satisfiable if there
exists a database $D$ such that $G_\Pi(D) \neq \emptyset$. ■

Proposition 6.1 Let $\Pi$ be a STD program with goal predicate $G$ and let $\varphi$ be the corresponding CTL formula;
$\varphi$ is satisfiable iff $G$ is satisfiable.

Proof 

($\Rightarrow$) Suppose that $\varphi$ is satisfiable; then there exists a Kripke structure $\mathcal{K} = (W, R, V)$ such that $\mathcal{K}, s \models \varphi$, for
some $s \in W$. If $\mathcal{K}$ is finite, then by Theorem 4.2 we obtain that $s \in G_\Pi(D)$, where $D$ is the database that
corresponds to $\mathcal{K}$. If $\mathcal{K}$ is infinite, then by Theorem 2.2 there exists a finite Kripke structure $\mathcal{K}_f = (W_f, R_f, V_f)$
such that $\mathcal{K}_f, s' \models \varphi$, for some $s' \in W_f$. Invoking Theorem 4.2 we derive that $s' \in G_\Pi(D)$, where $D$ is the
database that corresponds to $\mathcal{K}_f$. We conclude that in both cases $G$ is satisfiable.

($\Leftarrow$) Suppose now that $G$ is satisfiable. This means that there exists a database $D = (R, P_0, \ldots, P_n)$ with domain
$W$, such that $G_\Pi(D) \neq \emptyset$. Hence, by Theorem 5.2 we obtain that $\varphi[\mathcal{K}] \neq \emptyset$, where $\mathcal{K}$ is the finite Kripke
structure that corresponds to $D$. This implies that there exists a state $s \in W$ such that $\mathcal{K}, s \models \varphi$, i.e., $\varphi$ is satisfiable. ⊣

Proposition 6.1 provides a proof only for the unary goal predicates. The following proposition deals with the case
of the binary $B(x, y)$ predicates.

Proposition 6.2 Let $\Pi$ be a STD program and let $B$ be a binary IDB predicate of $\Pi$; the satisfiability of $B$ is
reduced in polynomial time to the satisfiability of a unary goal predicate $G$ of a STD program.

Proof

If $\Pi$ contains a binary IDB predicate $B(x, y)$, then it has a subprogram $\Pi' = \widetilde{U}[\Pi_1, \Pi_2]$. Let $\varphi$, $\psi_1$ and $\psi_2$ be
the CTL formulae corresponding to $\Pi'$, $\Pi_1$ and $\Pi_2$. According to Definition 5.1, $\varphi = E(\psi_1 \widetilde{U} \psi_2)$; consider now
the CTL formula $\varphi^* = \varphi \wedge \neg E(\top U \psi_1)$. Let $\Pi''$ be the STD program corresponding to $\varphi^*$ and let $G$ be the goal
predicate of $\Pi''$. But then $B$ is satisfiable iff $G$ is satisfiable. Finally, it is easy to see that the above reduction
takes place in polynomial time. ⊣

In order to argue about satisfiability we have to argue about the satisfiability of every IDB predicate. A STD
program may contain the IDB predicates $A$, $W$, $G_i$ and $B_j$. The first two predicates are trivially
satisfiable. For the remaining two, Propositions 6.1 and 6.2 reduce their satisfiability to that of a CTL formula. Thus, the
following theorem is an immediate consequence of Corollary 6.1 and Propositions 6.1 and 6.2.

Theorem 6.2 The satisfiability problem for STD programs is EXPTIME-complete. 
6.3 Containment

Deciding the containment of STD programs can also be reduced to the problem of checking the implication of
CTL formulae. First we give some basic definitions regarding the notion of containment for Datalog programs
and CTL formulae.

Definition 6.2 (Containment and equivalence of Datalog queries) Given two Datalog queries $\Pi_1$ and $\Pi_2$ with
goal predicates $G_1$ and $G_2$, we say that $\Pi_1$ is contained in $\Pi_2$, denoted $\Pi_1 \sqsubseteq \Pi_2$, if and only if for every database
$D$, $G_{1\Pi_1}(D) \subseteq G_{2\Pi_2}(D)$. $\Pi_1$ and $\Pi_2$ are equivalent, denoted $\Pi_1 \equiv \Pi_2$, if $\Pi_1 \sqsubseteq \Pi_2$ and $\Pi_2 \sqsubseteq \Pi_1$. ■

A similar notion of containment can also be cast in terms of truth sets of CTL formulae. 

Definition 6.3 (Containment of CTL formulae) Given two CTL formulae $\varphi_1$ and $\varphi_2$, we say that $\varphi_1$ is contained
in $\varphi_2$, denoted $\varphi_1 \sqsubseteq \varphi_2$, if and only if for every finite Kripke structure $\mathcal{K}$, $\varphi_1[\mathcal{K}] \subseteq \varphi_2[\mathcal{K}]$. ■

Suppose we have two CTL formulae $\varphi_1$ and $\varphi_2$; we say that $\varphi_1$ implies $\varphi_2$ if for every Kripke structure
$\mathcal{K} = (W, R, V)$ and for every $s \in W$, $\mathcal{K}, s \models \varphi_1$ implies that $\mathcal{K}, s \models \varphi_2$. If $\varphi_1$ implies $\varphi_2$, then the formula $\varphi_1 \rightarrow \varphi_2$
is valid and vice versa. Hence, we use the notation $\models \varphi_1 \rightarrow \varphi_2$ to assert that $\varphi_1$ implies $\varphi_2$ and $\models_f \varphi_1 \rightarrow \varphi_2$ to
assert that $\varphi_1$ implies $\varphi_2$ in finite Kripke structures. The following corollary follows directly from Theorem 2.1.

Corollary 6.2 (Implication)

The problem of deciding whether a CTL formula $\varphi_1$ implies a CTL formula $\varphi_2$ is EXPTIME-complete.

The following proposition states that, given two CTL formulae $\varphi_1$, $\varphi_2$, in order to check the implication $\models \varphi_1 \rightarrow \varphi_2$
it is sufficient to check implication only on finite Kripke structures, that is $\models_f \varphi_1 \rightarrow \varphi_2$, because, as we have
already said, CTL exhibits an important property, namely the bounded model property (see Theorem 2.2).

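Proposition 6.3 Given two CTL formulae $\varphi_1$ and $\varphi_2$ the following are equivalent:

1. $\varphi_1 \sqsubseteq \varphi_2$

2. $\models \varphi_1 \rightarrow \varphi_2$

Proof

($1 \Rightarrow 2$) $\varphi_1 \sqsubseteq \varphi_2$ means that for every finite Kripke structure $\mathcal{K} = (W, R, V)$, $\varphi_1[\mathcal{K}] \subseteq \varphi_2[\mathcal{K}]$, which implies that
if $s \in \varphi_1[\mathcal{K}]$, then $s \in \varphi_2[\mathcal{K}]$ ($s \in W$). Therefore, for every finite Kripke structure $\mathcal{K} = (W, R, V)$ and for every
$s \in W$, $\mathcal{K}, s \models \varphi_1$ implies $\mathcal{K}, s \models \varphi_2$, that is $\models_f \varphi_1 \rightarrow \varphi_2$.

It remains to consider the infinite case; we will prove that the next two assertions are equivalent:

(a) $\models_f \varphi_1 \rightarrow \varphi_2$

(b) $\models \varphi_1 \rightarrow \varphi_2$

It is obvious that (b) implies (a). To show that (a) also implies (b) we assume, towards a contradiction, that
(a) holds and (b) does not hold. This means that $\varphi_1 \wedge \neg\varphi_2$ is satisfiable, i.e., it has a model $\mathcal{K}$. $\mathcal{K}$ cannot be
finite because of (a). It must, therefore, be infinite. But then, from Theorem 2.2 we obtain that $\varphi_1 \wedge \neg\varphi_2$ has a
finite model $\mathcal{K}_f$, which is a contradiction because of (a).

($2 \Rightarrow 1$) $\models \varphi_1 \rightarrow \varphi_2$ means that for every Kripke structure $\mathcal{K} = (W, R, V)$, $\mathcal{K} \models \varphi_1 \rightarrow \varphi_2$. Consequently, for
every $s \in W$, $\mathcal{K}, s \models \varphi_1$ implies that $\mathcal{K}, s \models \varphi_2$, or in other words, $\varphi_1[\mathcal{K}] \subseteq \varphi_2[\mathcal{K}]$. Thus, $\varphi_1 \sqsubseteq \varphi_2$. ⊣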

The following theorem is a direct consequence of Corollary 6.2 and Proposition 6.3.

Theorem 6.3 The containment problem for CTL formulae is EXPTIME-complete.

The next theorem follows directly from Theorems 4.2 and 5.2.

Theorem 6.4 The containment problem for STD programs is EXPTIME-complete. 
Proof 

Let $\Pi_1$, $\Pi_2$ be STD queries with goal predicates $G_1$, $G_2$ and let $\varphi_1$, $\varphi_2$ be the corresponding CTL formulae. We
shall prove that $\Pi_1 \sqsubseteq \Pi_2$ iff $\varphi_1 \sqsubseteq \varphi_2$.

($\Rightarrow$) Suppose that $\Pi_1 \sqsubseteq \Pi_2$, but it is not the case that $\varphi_1 \sqsubseteq \varphi_2$, i.e., there exists a finite Kripke structure $\mathcal{K}'$
such that $\varphi_1[\mathcal{K}'] \not\subseteq \varphi_2[\mathcal{K}']$. Let $D'$ be the database that corresponds to $\mathcal{K}'$; then by Theorem 4.2 we get that
$G_{1\Pi_1}(D') \not\subseteq G_{2\Pi_2}(D')$. But this is a contradiction because the fact that $\Pi_1 \sqsubseteq \Pi_2$ implies that for every database
$D$, $G_{1\Pi_1}(D) \subseteq G_{2\Pi_2}(D)$. Thus, it must be the case that $\varphi_1 \sqsubseteq \varphi_2$.

($\Leftarrow$) Suppose that $\varphi_1 \sqsubseteq \varphi_2$, but it is not the case that $\Pi_1 \sqsubseteq \Pi_2$, i.e., there exists a database $D'$ such that
$G_{1\Pi_1}(D') \not\subseteq G_{2\Pi_2}(D')$. Let $\mathcal{K}'$ be the finite Kripke structure that corresponds to $D'$. Theorem 5.2 implies that
$\varphi_1[\mathcal{K}'] \not\subseteq \varphi_2[\mathcal{K}']$, which is a contradiction because $\varphi_1 \sqsubseteq \varphi_2$ means that for every finite Kripke structure $\mathcal{K}$,
$\varphi_1[\mathcal{K}] \subseteq \varphi_2[\mathcal{K}]$. Hence, it must be the case that $\Pi_1 \sqsubseteq \Pi_2$. ⊣

The following theorem is an immediate consequence of Theorem 6.4.

Theorem 6.5 The equivalence problem for STD programs is EXPTIME-complete.

7 Embedding CTL into $Datalog_{Succ}$

This section presents an embedding of CTL into a fragment of $Datalog_{Succ}$ that we call the class of Temporal
Datalog Successor (TDS) programs. When embedding CTL into stratified Datalog (Sections 4 and 5) we considered
CTL formulae to be written in existential normal form. In this section we present another aspect of the
relation between CTL and Datalog. In particular, we consider CTL formulae written in positive normal form
(see Section 2) and we give an embedding into $Datalog_{Succ}$. $Datalog_{Succ}$ extends Datalog by assuming an ordered
domain. In order to express a CTL formula written in positive normal form in $Datalog_{Succ}$ we may need to
visit all states of the database; consider, for instance, the CTL formula $A(\psi_1 \widetilde{U} \psi_2)$. This can be done by using
the $Succ$ predicate of $Datalog_{Succ}$ to count the number of states of the structure reachable from a given state
and check whether this number exceeds the cardinality of the database, denoted by $c_{max}$.

Since Papadimitriou in [Pap85] proved that $Datalog_{Succ}$ captures polynomial time, we expect that there is an
embedding from CTL to $Datalog_{Succ}$. In this work we give the exact translation rules. Recall that $Succ(X, Y)$
means that $Y$ is the successor of $X$, where $X$ and $Y$ take values from a totally ordered domain. We actually use
the equivalent notation $X + 1$ for $Succ$ and we make the assumption that the number of elements in the domain
is given and denoted by $c_{max}$. Note that we use the conventional semantics of Datalog (we compute $G_\Pi(D)$ by
computing least fixed points), which constitutes a contribution relative to the work in [GFAA03].

As already mentioned, to traverse all states of a Kripke structure we need the successor built-in predicate
$Succ$. When traversing a path, the order of states is implicitly given by the succession of states on the path.
However, this is not the case when we want to traverse the children of a certain state. So, for this embedding we
have to assume that the set $W_x$ of all children of a certain state $x$ is totally ordered.

In this paragraph we explain in detail how we use the $Succ$ predicate to traverse the Kripke structure.
Kripke structures in essence are directed labeled graphs. In finite Kripke structures every node has a finite
branching degree. In other words, for every $x \in W$ there exist $k$ distinct elements $y_0, \ldots, y_{k-1}$ of $W$ such that
$R(x, y_0), \ldots, R(x, y_{k-1})$, for some $k \in \mathbb{N}$ that depends on $x$. When we give the translation rules of CTL into
$Datalog_{Succ}$ it is necessary to capture the relation between a node and its successors. This can be achieved
by introducing the pairwise disjoint relations $S_0, \ldots, S_{k-1}$ (where $k$ is the maximum branching degree of $\mathcal{K}$) in
the corresponding relational database $D$. These relations serve as a "refinement" of the accessibility relation $R$:
$R = \bigcup_{i=0}^{k-1} S_i$. Hence, for every node $x$ with $k$ successors we may write $S_0(x, y_0), \ldots, S_{k-1}(x, y_{k-1})$ instead of
$R(x, y_0), \ldots, R(x, y_{k-1})$, meaning that $y_0, \ldots, y_{k-1}$ are the $1^{st}, \ldots, k^{th}$ children of $x$, respectively. It is easy to
see that we can express the $S_i$'s using $R$ and $Succ$.

In this translation, for simplicity, we consider Kripke structures of outdegree at most 2. Such structures
can be described by the two disjoint relations $S_0$ and $S_1$ instead of $R$; $S_0(x, y)$ ($S_1(x, y)$) expresses that $y$
is the first (the second) child of $x$. However, $S_1(x, y)$ may not be defined for every state $x$. Notice that due to
the totality of $R$, $S_0(x, y)$ is total, i.e., $\forall x \exists y\, S_0(x, y)$.
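
The refinement of $R$ into $S_0, \ldots, S_{k-1}$ can be sketched in a few lines of Python. The sketch assumes that $R$ is a set of pairs and that the total order on the children of a node is simply Python's ordering of the state names; the function name `refine` and the sample relation are illustrative only.

```python
from collections import defaultdict


def refine(R):
    """Split R into [S_0, S_1, ...], where S_i holds each node's (i+1)-th child."""
    children = defaultdict(list)
    for x, y in R:
        children[x].append(y)
    S = defaultdict(set)
    for x, ys in children.items():
        for i, y in enumerate(sorted(ys)):   # sorted() supplies the assumed total order
            S[i].add((x, y))
    return [S[i] for i in range(len(S))]


# A total relation of outdegree at most 2.
R = {("a", "b"), ("a", "c"), ("b", "c"), ("c", "c")}
S = refine(R)
```

For this relation the result has two components: `S[0]` is total on the states, while `S[1]` is defined only for the state with two children, as described above.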

7.1 The class TDS 

TDS programs are built up from: (a) two binary ($S_0$, $S_1$) and an arbitrary number of unary EDB predicates,
and (b) unary and binary IDB predicates. A unary IDB is taken to be the goal predicate of the program.

Definition 7.1 
• The programs $G(x) \leftarrow P_i(x)$, $G(x) \leftarrow \neg P_i(x)$ and $G(x) \leftarrow W(x)$ are $TDS_n$ programs having goal predicate $G$.

• If $\Pi_1$ and $\Pi_2$ are $TDS_n$ programs with goal predicates $G_1$ and $G_2$ respectively, then $\wedge[\Pi_1, \Pi_2]$, $\vee[\Pi_1, \Pi_2]$,
$X_\exists[\Pi_1]$, $X_\forall[\Pi_1]$, $U_\exists[\Pi_1, \Pi_2]$, $U_\forall[\Pi_1, \Pi_2]$, $\widetilde{U}_\exists[\Pi_1, \Pi_2]$ and $\widetilde{U}_\forall[\Pi_1, \Pi_2]$ are also $TDS_n$ programs with goal
predicate $G$.

• The class TDS is the union of the $TDS_n$ subclasses:

$TDS = \bigcup_{n \geq 0} TDS_n$ (11)

In the translation rules we use the notation $X + 1$ for the successor of $X$. The program operators $\wedge[\cdot, \cdot]$,
$\vee[\cdot, \cdot]$, $X_\exists[\cdot]$, $X_\forall[\cdot]$, $U_\exists[\cdot, \cdot]$, $U_\forall[\cdot, \cdot]$, $\widetilde{U}_\exists[\cdot, \cdot]$ and $\widetilde{U}_\forall[\cdot, \cdot]$ depicted in Figure 2 capture the meaning of the logical
connectives $\wedge$, $\vee$ and the temporal operators $E\bigcirc$, $A\bigcirc$, $EU$, $AU$, $E\widetilde{U}$, $A\widetilde{U}$, respectively. $\Pi^W$ is used again as an
abbreviation for a set of rules. The IDB predicates $W$ and $B$ have the same meaning as in the STD programs. As
already stated, we use Datalog with the successor built-in predicate (negation is only applied to EDB predicates).
The successor is only required for formulae of the form $A(\psi_1 \widetilde{U} \psi_2)$. In this case, the constant $c_{max}$ is a natural
number greater than or equal to 1; $c_{max}$ is equal to the cardinality $|W|$ of the underlying temporal Kripke
structure $\mathcal{K}$. The intuition behind the operator $\widetilde{U}_\forall[\cdot, \cdot]$ is the following. The temporal operator $A\widetilde{U}$ holds on a state
$s$ if for any path with initial state $s$ either:

(1) it is a finite path, $\psi_2$ holds on all its states and $\psi_1$ holds on its last state, or

(2) it is an infinite path and $\psi_2$ holds on all its states.

The first, third and fourth rule capture case (1) and they are similar to the rules of "until". The rest of the
rules capture case (2). Predicate $C(x, n)$ expresses the fact that all paths that start from state $x$ and are of
length less than or equal to $n$ either are assigned $\psi_2$ on all their states up until there is a state assigned $\psi_1$,
or are assigned $\psi_2$ on all their states. The number $c_{max}$ denotes the maximum number of states. $C(x, c_{max})$
establishes that all paths starting from $x$ belong to either case (1) or case (2) above. If $C(x, c_{max})$ holds, then
all infinite paths starting from $x$, for which (1) above does not hold, have all their states assigned $\psi_2$. This is
true because on a finite graph all paths of length greater than the number of its nodes contain a cycle. From the
six rules with head $C$, the last two are initialization rules ($x$ may have one child or two children). The other four
assert that given any path $\pi$ of length $n$ starting from $x$ either: (a) $\psi_1 \widetilde{U} \psi_2$ is true on $\pi$, or (b) $\psi_2$ holds on all $n$
states of $\pi$.

7.2 Translation rules 

In this section we define an embedding from CTL formulae into TDS programs. This is done via a mapping
$h' = (h'_f, h'_s)$ such that:

1. $h'_f$ maps CTL formulae into TDS programs; $h'_f(\varphi)$ is a program $\Pi$ with unary goal predicate $G$.

2. $h'_s$ maps temporal Kripke structures to relational databases, i.e., $h'_s(\mathcal{K})$ is a database $D$.

3. For this mapping the following holds:

$\varphi[\mathcal{K}] = G_\Pi(D)$, where $\Pi = h'_f(\varphi)$ and $D = h'_s(\mathcal{K})$

The exact mapping $h'_f$ of CTL formulae into TDS programs is given below. We use the operators of Figure 2
for succinctness and assume that $\Pi_i$ corresponds to subformula $\psi_i$, $i = 1, 2$.

Definition 7.2 Let $\varphi$ be a CTL formula and let $p_0, \ldots, p_n$ be the atomic propositions appearing in $\varphi$. Then
$h'_f(\varphi)$ is the $TDS_n$ program defined recursively as follows:

1. If $\varphi = p_i$ or $\varphi = \neg p_i$ or $\varphi = \top$, then $h'_f(\varphi)$ is $G(x) \leftarrow P_i(x)$, $G(x) \leftarrow \neg P_i(x)$ and $G(x) \leftarrow W(x)$ (together with the rules $\Pi^W$ defining $W$),
respectively.

2. If $\varphi = \psi_1 \wedge \psi_2$ or $\varphi = \psi_1 \vee \psi_2$, then $h'_f(\varphi)$ is $\wedge[\Pi_1, \Pi_2]$ and $\vee[\Pi_1, \Pi_2]$, respectively.

3. If $\varphi = E\bigcirc\psi_1$ or $\varphi = A\bigcirc\psi_1$ or $\varphi = E(\psi_1 U \psi_2)$ or $\varphi = A(\psi_1 U \psi_2)$ or $\varphi = E(\psi_1 \widetilde{U} \psi_2)$ or $\varphi = A(\psi_1 \widetilde{U} \psi_2)$,
then $h'_f(\varphi)$ is $X_\exists[\Pi_1]$, $X_\forall[\Pi_1]$, $U_\exists[\Pi_1, \Pi_2]$, $U_\forall[\Pi_1, \Pi_2]$, $\widetilde{U}_\exists[\Pi_1, \Pi_2]$ and $\widetilde{U}_\forall[\Pi_1, \Pi_2]$, respectively. ■



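The query operators of the class TDS

$\Pi^W$:
    $W(x) \leftarrow S_0(x, y)$
    $W(x) \leftarrow S_0(y, x)$
    $W(x) \leftarrow S_1(x, y)$
    $W(x) \leftarrow S_1(y, x)$
    $W(x) \leftarrow P_0(x)$
    $\ldots$
    $W(x) \leftarrow P_n(x)$

$\wedge[\Pi_1, \Pi_2]$:
    $G(x) \leftarrow G_1(x), G_2(x)$

$\vee[\Pi_1, \Pi_2]$:
    $G(x) \leftarrow G_1(x)$
    $G(x) \leftarrow G_2(x)$

$X_\exists[\Pi_1]$:
    $G(x) \leftarrow S_0(x, y), G_1(y)$
    $G(x) \leftarrow S_1(x, y), G_1(y)$

$X_\forall[\Pi_1]$:
    $G(x) \leftarrow S_0(x, y), \neg 2S(x), G_1(y)$
    $G(x) \leftarrow S_0(x, y), S_1(x, z), G_1(y), G_1(z)$
    $2S(x) \leftarrow S_0(x, y), S_1(x, z)$

$U_\exists[\Pi_1, \Pi_2]$:
    $G(x) \leftarrow G_2(x)$
    $G(x) \leftarrow G_1(x), S_0(x, y), G(y)$
    $G(x) \leftarrow G_1(x), S_1(x, y), G(y)$

$U_\forall[\Pi_1, \Pi_2]$:
    $G(x) \leftarrow G_2(x)$
    $G(x) \leftarrow G_1(x), S_0(x, y), \neg 2S(x), G(y)$
    $G(x) \leftarrow G_1(x), S_0(x, y), S_1(x, z), G(y), G(z)$
    $2S(x) \leftarrow S_0(x, y), S_1(x, z)$

$\widetilde{U}_\exists[\Pi_1, \Pi_2]$:
    $G(x) \leftarrow G_1(x), G_2(x)$
    $G(x) \leftarrow B(x, x)$
    $G(x) \leftarrow G_2(x), S_0(x, y), G(y)$
    $G(x) \leftarrow G_2(x), S_1(x, y), G(y)$
    $B(x, y) \leftarrow G_2(x), S_0(x, y), G_2(y)$
    $B(x, y) \leftarrow G_2(x), S_1(x, y), G_2(y)$
    $B(x, y) \leftarrow G_2(x), S_0(x, u), B(u, y)$
    $B(x, y) \leftarrow G_2(x), S_1(x, u), B(u, y)$

$\widetilde{U}_\forall[\Pi_1, \Pi_2]$:
    $G(x) \leftarrow G_1(x), G_2(x)$
    $G(x) \leftarrow C(x, c_{max})$
    $G(x) \leftarrow G_2(x), S_0(x, y), \neg 2S(x), G(y)$
    $G(x) \leftarrow G_2(x), S_0(x, y), S_1(x, z), G(y), G(z)$
    $C(x, n) \leftarrow G_2(x), S_0(x, y), \neg 2S(x), C(y, n-1), n \leq c_{max}$
    $C(x, n) \leftarrow G_2(x), S_0(x, y), S_1(x, z), C(y, n-1), C(z, n-1), n \leq c_{max}$
    $C(x, n) \leftarrow G_2(x), S_0(x, y), S_1(x, z), G(y), C(z, n-1), n \leq c_{max}$
    $C(x, n) \leftarrow G_2(x), S_0(x, y), S_1(x, z), C(y, n-1), G(z), n \leq c_{max}$
    $C(x, 1) \leftarrow G_2(x), S_0(x, y), \neg 2S(x), G_2(y)$
    $C(x, 1) \leftarrow G_2(x), S_0(x, y), S_1(x, z), G_2(y), G_2(z)$
    $2S(x) \leftarrow S_0(x, y), S_1(x, z)$

Figure 2: These are the query operators used in the definition of the class TDS. $\Pi_1$ and $\Pi_2$ are $TDS_n$ programs with goal
predicates $G_1$ and $G_2$ respectively. $G$, $B$ and $C$ are "fresh" predicate symbols, i.e., they do not appear in $\Pi_1$ or $\Pi_2$. In
contrast, $W$ and $2S$ are the same in all programs. $\Pi^W$ is a convenient abbreviation of the rules depicted here.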
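
To make the least fixed point semantics concrete, here is a minimal Python sketch of a naive bottom-up evaluation of the existential "until" operator, read here as the three rules $G(x) \leftarrow G_2(x)$, $G(x) \leftarrow G_1(x), S_0(x, y), G(y)$ and $G(x) \leftarrow G_1(x), S_1(x, y), G(y)$. The truth sets of $G_1$ and $G_2$ are assumed to be already computed; the function name and the sample database are hypothetical.

```python
def eval_until_exists(G1, G2, S0, S1):
    """Least fixed point of:
         G(x) <- G2(x)
         G(x) <- G1(x), S0(x, y), G(y)
         G(x) <- G1(x), S1(x, y), G(y)
    """
    G = set(G2)                          # first rule
    changed = True
    while changed:                       # repeat rounds until nothing new is derived
        changed = False
        for (x, y) in S0 | S1:           # second and third rules
            if x in G1 and y in G and x not in G:
                G.add(x)
                changed = True
    return G


# States 0..3; state 0 has two children, the others have one.
S0 = {(0, 1), (1, 2), (2, 2), (3, 3)}
S1 = {(0, 3)}
G1 = {0, 1}      # states satisfying psi_1
G2 = {2}         # states satisfying psi_2
G = eval_until_exists(G1, G2, S0, S1)
```

Each pass of the outer loop corresponds to one round of the bottom-up evaluation; the loop stops at the first round that derives no new fact.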



Now we are ready to prove the main result of this section. 
Theorem 7.1 Let $\mathcal{K}$ be a finite Kripke structure and let $D$ be the corresponding relational database. If $\varphi$ is a
CTL formula and $\Pi$ its corresponding TDS program, then the following holds:

$\varphi[\mathcal{K}] = G_\Pi(D)$ (12)

Proof

The proof of (12) is carried out by induction on the structure of the formula $\varphi$. The complete proof is presented
in the Appendix. ⊣

7.3 Unbounded outdegree 

Our results can be easily extended to any Kripke structure with bounded outdegree. It is easy to show that even
if we do not have a structure of bounded degree, we can use the order of the domain to express the universal
quantifier. To do so we need the following built-in predicates:

(1) $S_0(x, y)$, which says that $y$ is the first (or leftmost) child of $x$, and

(2) $Next(x, y)$, which asserts that $y$ is the next sibling of $x$.

For instance, the translation of $A(\psi_1 U \psi_2)$ would be the following:

    $G(x) \leftarrow G_2(x)$
    $G(x) \leftarrow G_1(x), S_0(x, y), G(y), B(y)$
    $B(x) \leftarrow W(x), \neg N(x)$
    $B(x) \leftarrow Next(x, y), G(y), B(y)$
    $N(x) \leftarrow Next(x, y)$

In the above program $W$ is the IDB predicate defined by $\Pi^W$ (see Figure 2) that asserts that $x$ belongs to the
domain of the database.

8 Conclusions and Future Work 

We may express a CTL formula either by omitting the universal quantifier but allowing negation or by restricting
negation to the propositional atoms only and using the universal quantifier. The former yields an embedding
into stratified Datalog and the latter into $Datalog_{Succ}$. Moreover we identify a fragment of stratified Datalog,
called STD, with the same expressive power as CTL. For STD all the good properties of CTL can be carried
over, as the translation is linear in the size of the formula and the Datalog program. Thus, we derive new results
that prove the decidability of the satisfiability and query containment problems for STD programs by reducing
them to the validity problem for CTL. We also prove that query evaluation for STD programs can be done
in linear time with respect to the size of the database and the query.

In this paper we work with finite Kripke structures having a total accessibility relation. Our technique can be
applied to infinite tree structures if we also consider greatest fixed points. The translation goes through as it is,
with the only difference that for the negation of the until operator we need to use greatest fixed point semantics
(see [GFAA03] for details). In this case, the proofs of the theorems given in this paper are similar, except for the
argument related to the greatest fixed point semantics, which however uses the same intuition. We make this
remark because it gives helpful insight, but in the present paper we focus on finite structures because the query
languages in databases are applied on (and hence their semantics is restricted to) finite structures.

In future work we plan to extend our approach to CTL* (Full Branching Time Logic) [ES84], [EH86]. CTL
is a proper and less expressive fragment of CTL*. Although we believe that the extension is feasible, having
considered and investigated the problem for a short time, we think that the translation of CTL* will introduce
additional non-trivial complications.
9 Appendix: Proof of Theorem 7.1

9.1 Preliminary results

Before giving the proof we introduce some useful notions. Recall that Kripke structures are, in general, directed
labeled graphs and not necessarily trees. Nonetheless, it is convenient to view them as labeled trees, something
that is achieved by unwinding the Kripke structure from a specific node $s$, which is designated as the root of the
resulting tree. Technically, this can be done by using pairs from $W \times \mathbb{N}$, where $\mathbb{N}$ is the set of natural numbers,
instead of just nodes from $W$.

Definition 9.1 Given a Kripke structure $\mathcal{K} = (W, R, V)$, suppose that $\mathcal{K}, s \models A(\psi_1 U \psi_2)$. The $U$ unwinding of
$\mathcal{K}$ from $s$, denoted $\mathcal{K}^U_s$, is the Kripke structure $(W', R', V')$, where:

1. $W'$ is the least subset of $W \times \mathbb{N}$ such that:

• $(s, 0) \in W'$ and

• if $(s', n) \in W'$, $R(s', t)$ holds, $\mathcal{K}, s' \models \psi_1 \wedge \neg\psi_2$ and $\mathcal{K}, t \models \psi_1 \vee \psi_2$, then $(t, n+1) \in W'$

2. $R'((s', n), (t, n+1))$ holds iff $R(s', t)$ holds, and

3. $V'(t, n) = V(t)$. ■

Definition 9.2 Given a finite Kripke structure $\mathcal{K} = (W, R, V)$, suppose that $\mathcal{K}, s \models A(\psi_1 \widetilde{U} \psi_2)$. The $\widetilde{U}$ unwinding
of $\mathcal{K}$ from $s$, denoted $\mathcal{K}^{\widetilde{U}}_s$, is the Kripke structure $(W', R', V')$, where:

1. $W'$ is the least subset of $W \times \mathbb{N}$ such that:

• $(s, 0) \in W'$ and

• if $(s', n) \in W'$, $n < |W| - 1$, $R(s', t)$ holds, $\mathcal{K}, s' \models \psi_2 \wedge \neg\psi_1$ and $\mathcal{K}, t \models \psi_2$, then $(t, n+1) \in W'$

2. $R'((s', n), (t, n+1))$ holds iff $R(s', t)$ holds, and

3. $V'(t, n) = V(t)$. ■

The $U$ and $\widetilde{U}$ unwindings of a finite Kripke structure $\mathcal{K}$ are finite labeled trees. Moreover, if $\mathcal{K}$ has branching
degree two, then $\mathcal{K}^U_s$ and $\mathcal{K}^{\widetilde{U}}_s$ are finite binary trees. Let $(s', n)$ be a state of $\mathcal{K}^U_s$ (or $\mathcal{K}^{\widetilde{U}}_s$). If there exists a state
$(t, n+1)$ such that $R'((s', n), (t, n+1))$, then $(s', n)$ is an internal node of $\mathcal{K}^U_s$ (or $\mathcal{K}^{\widetilde{U}}_s$); otherwise $(s', n)$ is a leaf.





Figure 3: Examples of $U$ and $\widetilde{U}$ unwinding.

Example 9.1 Consider the Kripke structure shown in Figure 3(a). The $U$ unwinding of $A(\psi_1 U \psi_2)$ from $s_0$ is
shown in Figure 3(b) and the $\widetilde{U}$ unwinding of $A(\psi_3 \widetilde{U} \psi_2)$ from $s_1$ is depicted in Figure 3(c). Note that in Figure
3(c) $(s_4, 3)$ is an internal node whereas $(s_4, 4)$ is a leaf. ▲

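Proposition 9.1 Let $\mathcal{K} = (W, R, V)$ be a Kripke structure, let $\mathcal{K}, s \models A(\psi_1 U \psi_2)$ and let $\mathcal{K}^U_s$ be the $U$ unwinding
of $\mathcal{K}$ from $s$. Then the following hold:

1. If $(s', n)$ is a leaf of $\mathcal{K}^U_s$, then $\mathcal{K}, s' \models \psi_2$.

2. If $(s', n)$ is an internal node of $\mathcal{K}^U_s$, then $\mathcal{K}, s' \models \psi_1 \wedge \neg\psi_2$.

Proposition 9.2 Let $\mathcal{K} = (W, R, V)$ be a finite Kripke structure, let $\mathcal{K}, s \models A(\psi_1 \widetilde{U} \psi_2)$ and let $\mathcal{K}^{\widetilde{U}}_s$ be the $\widetilde{U}$
unwinding of $\mathcal{K}$ from $s$. Then the following hold:

1. If $(s', n)$ is a leaf of $\mathcal{K}^{\widetilde{U}}_s$, then either

(a) $\mathcal{K}, s' \models \psi_1 \wedge \psi_2$, or

(b) $n = |W| - 1$, $\mathcal{K}, s' \models \neg\psi_1 \wedge \psi_2$ and for every child $t$ of $s'$, $\mathcal{K}, t \models \psi_2$.

2. If $(s', n)$ is an internal node of $\mathcal{K}^{\widetilde{U}}_s$, then $n < |W| - 1$ and also $\mathcal{K}, s' \models \neg\psi_1 \wedge \psi_2$.

Proposition 9.3 Let $\mathcal{K} = (W, R, V)$ be a finite Kripke structure and let $s_0, \ldots, s_i, \ldots, s_j, \ldots, s_n$ be a finite
path in $\mathcal{K}$ (or in the corresponding database $D$), where $n \geq |W|$. Then, there exist indices $i < j$ and a state $s$ such that $s_i = s_j = s$.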
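
Proposition 9.3 is the pigeonhole principle applied to paths: a path that visits more states than $|W|$ must repeat one. The following Python sketch locates such a repetition; the function name `first_repeat` and the sample path are illustrative and not part of the paper.

```python
def first_repeat(path):
    """Return (i, j, s) with i < j and path[i] == path[j] == s, or None if no state repeats."""
    seen = {}                        # state -> index of its first occurrence
    for j, s in enumerate(path):
        if s in seen:
            return seen[s], j, s
        seen[s] = j
    return None


W = {"a", "b", "c"}
path = ["a", "b", "c", "b"]          # n = 3 = |W|, so the path has |W| + 1 states
repeat = first_repeat(path)
```

Whenever `len(path) > len(W)` and every element of `path` belongs to `W`, the function cannot return `None`.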
9.2 Proof of Theorem 7.1

We are ready now to prove that (12) holds by induction on the structure of formula $\varphi$. To increase the readability
of the proof, we use the subscripts in the goal predicates to denote the corresponding CTL formula. For instance,
we write $G_{E\bigcirc\psi}$ to denote that $G$ is the goal predicate of the program corresponding to $E\bigcirc\psi$. We consider the
two directions separately and begin by considering the ($\Rightarrow$) direction.

Proof ($\Rightarrow$)

1. If $\varphi = p$ or $\varphi = \neg p$, where $p \in AP$, or $\varphi = \top$, then the corresponding programs are those of Definition
7.2 (1). Trivially, then:

• $\mathcal{K}, s \models p \Rightarrow p \in V(s) \Rightarrow P(s)$ is a ground fact of $D \Rightarrow s \in G_p(D)$.

• $\mathcal{K}, s \models \neg p \Rightarrow p \notin V(s) \Rightarrow P(s)$ is not a ground fact of $D \Rightarrow s \in G_{\neg p}(D)$.

• $\mathcal{K}, s \models \top \Rightarrow s \in W \Rightarrow$ (by the totality of $R$) there exists $t \in W$ such that $(s, t) \in S_0 \cup S_1 \Rightarrow s \in
W_{\Pi^W}(D) \Rightarrow s \in G_\top(D)$.

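2. If $\varphi = \psi_1 \vee \psi_2$ or $\varphi = \psi_1 \wedge \psi_2$, then the corresponding programs are shown in Definition 7.2 (2). Again,
the following hold:

• $\mathcal{K}, s \models \varphi \Rightarrow \mathcal{K}, s \models \psi_1$ or $\mathcal{K}, s \models \psi_2 \Rightarrow$ (by the induction hypothesis) $s \in G_{\psi_1}(D)$ or $s \in G_{\psi_2}(D) \Rightarrow$
$s \in G_{\psi_1}(D) \cup G_{\psi_2}(D) \Rightarrow s \in G_\varphi(D)$.

• $\mathcal{K}, s \models \varphi \Rightarrow \mathcal{K}, s \models \psi_1$ and $\mathcal{K}, s \models \psi_2 \Rightarrow$ (by the induction hypothesis) $s \in G_{\psi_1}(D)$ and $s \in G_{\psi_2}(D) \Rightarrow$
$s \in G_{\psi_1}(D) \cap G_{\psi_2}(D) \Rightarrow s \in G_\varphi(D)$.

3. If $\varphi = E\bigcirc\psi$ then the corresponding program is shown in Definition 7.2 (3).

Let us assume that $\mathcal{K}, \pi \models \varphi$ for some path $\pi = s_0, s_1, s_2, \ldots$ with initial state $s_0$. We know that either
$S_0(s_0, s_1)$ or $S_1(s_0, s_1)$ holds. Now $\mathcal{K}, \pi \models \varphi$ for the path $\pi = s_0, s_1, s_2, \ldots \Rightarrow \mathcal{K}, \pi^1 \models \psi$ for the path
$\pi^1 = s_1, s_2, \ldots \Rightarrow \mathcal{K}, s_1 \models \psi \Rightarrow$ (by the induction hypothesis) $s_1 \in G_\psi(D)$. From $\Pi_\varphi$, by combining $G_\psi(s_1)$
with one of $S_0(s_0, s_1)$ or $S_1(s_0, s_1)$, we immediately derive $G_\varphi(s_0)$ and, thus, $s_0 \in G_\varphi(D)$.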

4. If $\varphi = A\bigcirc\psi$, then the corresponding program is shown in Definition 7.2 (3).

Let us assume now that $\mathcal{K}, \pi \models \varphi$ for every path $\pi = s_0, s_1, s_2, \ldots$ with initial state $s_0$. It is convenient to distinguish two cases:

(a) $s_0$ has a left child $s_1^L$, but not a right child. In this case $\mathcal{K}, \pi \models \varphi$ for every path $\pi = s_0, s_1, s_2, \ldots$ with initial state $s_0 \Rightarrow \mathcal{K}, \pi^{1,L} \models \psi$ for every path $\pi^{1,L} = s_1^L, s_2, \ldots$ with initial state $s_1^L \Rightarrow \mathcal{K}, s_1^L \models \psi \Rightarrow$ (by the induction hypothesis) $s_1^L \in G_\psi(D)$. Moreover, in this case $S_0(s_0, s_1^L)$ and $\neg 2S(s_0)$ are true and, therefore, evaluation of the second rule of $\Pi_\varphi$ gives $G_\varphi(s_0) \leftarrow S_0(s_0, s_1^L), \neg 2S(s_0), G_\psi(s_1^L) \Rightarrow s_0 \in G_\varphi(D)$.

(b) $s_0$ has both a left child $s_1^L$ and a right child $s_1^R$. Then $\mathcal{K}, \pi \models \varphi$ for every path $\pi = s_0, s_1, s_2, \ldots$ with initial state $s_0 \Rightarrow \mathcal{K}, \pi^{1,L} \models \psi$ for every path $\pi^{1,L} = s_1^L, s_2^L, \ldots$ with initial state $s_1^L$ and $\mathcal{K}, \pi^{1,R} \models \psi$ for every path $\pi^{1,R} = s_1^R, s_2^R, \ldots$ with initial state $s_1^R \Rightarrow \mathcal{K}, s_1^L \models \psi$ and $\mathcal{K}, s_1^R \models \psi \Rightarrow$ (by the induction hypothesis) $s_1^L, s_1^R \in G_\psi(D)$. Moreover, in this case $S_0(s_0, s_1^L)$ and $S_1(s_0, s_1^R)$ are true and, therefore, evaluation of the third rule of $\Pi_\varphi$ gives $G_\varphi(s_0) \leftarrow S_0(s_0, s_1^L), S_1(s_0, s_1^R), G_\psi(s_1^L), G_\psi(s_1^R) \Rightarrow s_0 \in G_\varphi(D)$.

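5. If $\varphi = E(\psi_1 U \psi_2)$, then the corresponding program is that of Definition 7.2 (3).

Suppose that $\mathcal{K}, \pi \models \varphi$, where the path $\pi$ is $s_0, s_1, s_2, \ldots$. We have to examine two cases:

(a) $\mathcal{K}, \pi \models \psi_2$ for the path $\pi = s_0, s_1, s_2, \ldots \Rightarrow \mathcal{K}, s_0 \models \psi_2 \Rightarrow$ (by the induction hypothesis) $s_0 \in G_{\psi_2}(D) \Rightarrow G_\varphi(s_0)$ (from the first rule $G_\varphi(x) \leftarrow G_{\psi_2}(x)$ of $\Pi_\varphi$) $\Rightarrow s_0 \in G_\varphi(D)$.

(b) $\mathcal{K}, \pi^i \models \psi_2$ for the path $\pi^i = s_i, s_{i+1}, s_{i+2}, \ldots$ and $\mathcal{K}, \pi^j \models \psi_1$ for $\pi^j = s_j, s_{j+1}, s_{j+2}, \ldots$ ($0 \le j \le i-1$) $\Rightarrow \mathcal{K}, s_i \models \psi_2$ and $\mathcal{K}, s_j \models \psi_1$ ($0 \le j \le i-1$) $\Rightarrow s_i \in G_{\psi_2}(D)$ and $s_j \in G_{\psi_1}(D)$ ($0 \le j \le i-1$) (by the induction hypothesis). We know that for every $r$, $0 \le r < i$, at least one of $S_0(s_r, s_{r+1})$ or $S_1(s_r, s_{r+1})$ holds. From the first rule $G_\varphi(x) \leftarrow G_{\psi_2}(x)$ of $\Pi_\varphi$ we derive $G_\varphi(s_i)$. Successive applications of the second rule ($G_\varphi(s_r) \leftarrow G_{\psi_1}(s_r), S_0(s_r, s_{r+1}), G_\varphi(s_{r+1})$) and the third rule ($G_\varphi(s_r) \leftarrow G_{\psi_1}(s_r), S_1(s_r, s_{r+1}), G_\varphi(s_{r+1})$) of $\Pi_\varphi$ for every $r$, $0 \le r < i$, yield $G_\varphi(s_{i-1}), G_\varphi(s_{i-2}), \ldots, G_\varphi(s_1), G_\varphi(s_0)$. Thus, $s_0 \in G_\varphi(D)$.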
6. If $\varphi = A(\psi_1 U \psi_2)$, then the corresponding program is that of Definition 7.2 (3).

Let us assume now that $\mathcal{K}, \pi \models \varphi$ for every path $\pi = s_0, s_1, s_2, \ldots$ with initial state $s_0$. Consider the $U$ unwinding of $\mathcal{K}$ from $s_0$ and let $(t_0, r)$ be any node of this unwinding; we shall prove that $t_0 \in G_\varphi(D)$. This property indeed implies the required result because $(s_0, 0)$ is a node (specifically the root) of the unwinding and, thus, $s_0 \in G_\varphi(D)$. To prove it, let $L_{t_0} = (t_0, r), (t_1, r+1), \ldots, (t_n, r+n)$ be the longest path from $(t_0, r)$ to a leaf $(t_n, r+n)$ of the unwinding. We use induction on the length $n$ of the path $L_{t_0}$.

(a) If $n = 0$, then node $(t_0, r)$ itself is a leaf. From Proposition 9.1 we know that $\mathcal{K}, t_0 \models \psi_2$ and by the induction hypothesis (pertaining to formula $\psi_2$) we get that $t_0 \in G_{\psi_2}(D)$. Then, from rule $G_\varphi(x) \leftarrow G_{\psi_2}(x)$ of $\Pi_\varphi$, we derive $G_\varphi(t_0)$.

(b) We show now that the claim holds for paths of length $n + 1$, assuming that it holds for paths of length less than or equal to $n$. In this case node $(t_0, r)$ is an internal node of the unwinding. From Proposition 9.2 we know that $\mathcal{K}, t_0 \models \psi_1$ and by the induction hypothesis (pertaining to formula $\psi_1$) we get that $t_0 \in G_{\psi_1}(D)$. We focus on the case where node $(t_0, r)$ has exactly two successors $(t_1^L, r+1)$ and $(t_1^R, r+1)$ in the unwinding (the case where $(t_0, r)$ has only one successor is easier). Since $L_{t_0}$ has length $n + 1$, both $L_{t_1^L}$ and $L_{t_1^R}$ have length at most $n$. Hence, by the induction hypothesis with respect to the length of the paths $L_{t_1^L}$ and $L_{t_1^R}$, we get that $t_1^L \in G_\varphi(D)$ and $t_1^R \in G_\varphi(D)$. So $G_{\psi_1}(t_0)$, $S_0(t_0, t_1^L)$, $S_1(t_0, t_1^R)$, $G_\varphi(t_1^L)$ and $G_\varphi(t_1^R)$ are true and, therefore, the evaluation of the third rule of $\Pi_\varphi$ gives $G_\varphi(t_0) \leftarrow G_{\psi_1}(t_0), S_0(t_0, t_1^L), S_1(t_0, t_1^R), G_\varphi(t_1^L), G_\varphi(t_1^R)$. Thus, $t_0 \in G_\varphi(D)$.

7. If $\varphi = E(\psi_1 \widetilde{U} \psi_2)$, then the corresponding program is shown in Definition 7.2 (3).

Suppose that $\mathcal{K}, \pi \models \varphi$, where the path $\pi$ is $s_0, s_1, s_2, \ldots$. We must consider two cases:

(a) $\mathcal{K}, \pi^i \models \psi_1 \wedge \psi_2$ for the path $\pi^i = s_i, s_{i+1}, s_{i+2}, \ldots$ and $\mathcal{K}, \pi^j \models \psi_2$ for $\pi^j = s_j, s_{j+1}, \ldots$ ($0 \le j \le i-1$) $\Rightarrow \mathcal{K}, s_i \models \psi_1 \wedge \psi_2$ and $\mathcal{K}, s_j \models \psi_2$ ($0 \le j \le i-1$) $\Rightarrow s_i \in G_{\psi_1}(D) \cap G_{\psi_2}(D)$ and $s_j \in G_{\psi_2}(D)$ ($0 \le j \le i-1$) (by the induction hypothesis). We know that for every $r$, $0 \le r < i$, at least one of $S_0(s_r, s_{r+1})$ or $S_1(s_r, s_{r+1})$ holds. From rule $G_\varphi(x) \leftarrow G_{\psi_1}(x), G_{\psi_2}(x)$ of $\Pi_\varphi$ we derive $G_\varphi(s_i)$. Successive applications of the other two rules of the program (i.e., $G_\varphi(s_r) \leftarrow G_{\psi_2}(s_r), S_0(s_r, s_{r+1}), G_\varphi(s_{r+1})$ and $G_\varphi(s_r) \leftarrow G_{\psi_2}(s_r), S_1(s_r, s_{r+1}), G_\varphi(s_{r+1})$) for every $r$, $0 \le r < i$, yield $G_\varphi(s_{i-1}), G_\varphi(s_{i-2}), \ldots, G_\varphi(s_1), G_\varphi(s_0)$. Thus, $s_0 \in G_\varphi(D)$.

(b) $\mathcal{K}, \pi^i \models \psi_2$ for the path $\pi^i = s_i, s_{i+1}, s_{i+2}, \ldots$, for every $i \ge 0$. This implies that $\mathcal{K}, s_i \models \psi_2$ for every $i \ge 0$, and (by the induction hypothesis) that $s_i \in G_{\psi_2}(D)$ for every $i \ge 0$. Let $s_0, s_1, s_2, \ldots, s_n$ be an initial segment of $\pi$, where $n = |W|$. From Proposition 9.3 we know that in this sequence there exists a state $s$ such that $s = s_k = s_l$, $0 \le k < l \le n$. Then $(s_k, s_k) \in B_{\psi_2}(D)$. We know that for every $r$, $0 \le r < k$, at least one of $S_0(s_r, s_{r+1})$ or $S_1(s_r, s_{r+1})$ holds. From rule $G_\varphi(x) \leftarrow B_{\psi_2}(x, x)$ we derive $G_\varphi(s_k)$. Successive applications of the fourth rule ($G_\varphi(s_r) \leftarrow G_{\psi_2}(s_r), S_0(s_r, s_{r+1}), G_\varphi(s_{r+1})$) or the fifth rule ($G_\varphi(s_r) \leftarrow G_{\psi_2}(s_r), S_1(s_r, s_{r+1}), G_\varphi(s_{r+1})$) of $\Pi_\varphi$ for every $r$, $0 \le r < k$, yield $G_\varphi(s_{k-1}), G_\varphi(s_{k-2}), \ldots, G_\varphi(s_1), G_\varphi(s_0)$. Accordingly, $s_0 \in G_\varphi(D)$.

8. If $\varphi = A(\psi_1 \widetilde{U} \psi_2)$, then the corresponding program is shown in Definition 7.2 (3).

Let us assume now that $\mathcal{K}, \pi \models \varphi$ for every path $\pi = s_0, s_1, s_2, \ldots$ with initial state $s_0$. Consider the $\widetilde{U}$ unwinding of $\mathcal{K}$ from $s_0$ and let $(t_0, r)$ be any node of this unwinding. We shall prove that either $G_\varphi(t_0)$ or $C_{\psi_2}(t_0, |W| - r)$ holds. This property of the nodes of the unwinding ensures that for the root $(s_0, 0)$ it must be the case that $s_0 \in G_\varphi(D)$ (recall the universal program in Section 5). To prove this property, let $L_{t_0} = (t_0, r), (t_1, r+1), \ldots, (t_n, r+n)$ be the longest path from $(t_0, r)$ to a leaf $(t_n, r+n)$ of the unwinding. We use induction on the length $n$ of the path $L_{t_0}$.

(a) If $n = 0$, then node $(t_0, r)$ itself is a leaf. We may assume that $t_0$ has exactly two successors $t_1^L$ and $t_1^R$ in $D$, because the case where $t_0$ has only one successor can be tackled in the same way. From Proposition 9.2 we know that there are two cases regarding $t_0$: (1) $\mathcal{K}, t_0 \models \psi_1 \wedge \psi_2$; in this case the induction hypothesis (with respect to $\psi_1$ and $\psi_2$) gives that $t_0 \in G_{\psi_1}(D)$ and $t_0 \in G_{\psi_2}(D)$, and from the first rule of $\Pi_\varphi$ we derive $G_\varphi(t_0)$. (2) $r = |W| - 1$, $\mathcal{K}, t_0 \models \neg\psi_1 \wedge \psi_2$, $\mathcal{K}, t_1^L \models \psi_2$ and $\mathcal{K}, t_1^R \models \psi_2$. The induction hypothesis with respect to $\psi_2$ gives that $t_0 \in G_{\psi_2}(D)$, $t_1^L \in G_{\psi_2}(D)$ and $t_1^R \in G_{\psi_2}(D)$. Using rule $C_{\psi_2}(t_0, 1) \leftarrow G_{\psi_2}(t_0), S_0(t_0, t_1^L), S_1(t_0, t_1^R), G_{\psi_2}(t_1^L), G_{\psi_2}(t_1^R)$ of $\Pi_\varphi$ we conclude that $C_{\psi_2}(t_0, 1)$ holds, which is the claim because $|W| - r = 1$.

(b) We prove now that the claim holds for paths of length $n + 1$, assuming that it holds for paths of length less than or equal to $n$. In this case node $(t_0, r)$ is an internal node of the unwinding. From Proposition 9.2 we know that $r < |W| - 1$ and $\mathcal{K}, t_0 \models \neg\psi_1 \wedge \psi_2$, and by the induction hypothesis (pertaining to $\psi_2$) we get that $t_0 \in G_{\psi_2}(D)$. We examine the case where node $(t_0, r)$ has exactly two successors $(t_1^L, r+1)$ and $(t_1^R, r+1)$ in the unwinding, where $r + 1 < |W|$. In this case $S_0(t_0, t_1^L)$ and $S_1(t_0, t_1^R)$ are true. Since $L_{t_0}$ has length $n + 1$, both $L_{t_1^L}$ and $L_{t_1^R}$ have length at most $n$. Hence, by the induction hypothesis (regarding the path length), we get that $G_\varphi(t_1^L)$ or $C_{\psi_2}(t_1^L, |W| - r - 1)$, and $G_\varphi(t_1^R)$ or $C_{\psi_2}(t_1^R, |W| - r - 1)$. If $G_\varphi(t_1^L)$ and $G_\varphi(t_1^R)$ are true, then the third rule of $\Pi_\varphi$ ($G_\varphi(x) \leftarrow G_{\psi_2}(x), S_0(x, y), S_1(x, z), G_\varphi(y), G_\varphi(z)$) implies that $G_\varphi(t_0)$ also holds. If $C_{\psi_2}(t_1^L, |W| - r - 1)$ and $C_{\psi_2}(t_1^R, |W| - r - 1)$ are true, then using the seventh rule of $\Pi_\varphi$ ($C_{\psi_2}(x, n) \leftarrow G_{\psi_2}(x), S_0(x, y), S_1(x, z), C_{\psi_2}(y, n-1), C_{\psi_2}(z, n-1), n \le |W|$) we conclude that $C_{\psi_2}(t_0, |W| - r)$ also holds. In the remaining two cases the eighth and ninth rules imply $C_{\psi_2}(t_0, |W| - r)$.

We have proved that for the node $(s_0, 0)$ one of $G_\varphi(s_0)$ or $C_{\psi_2}(s_0, |W|)$ holds. If we assume that $C_{\psi_2}(s_0, |W|)$ holds, then the fifth rule of $\Pi_\varphi$ implies $G_\varphi(s_0)$. Hence, in any case, $s_0 \in G_\varphi(D)$. $\Box$
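
The rule applications used in case 5 can be sketched as a naive bottom-up evaluation that records the facts derived after each round. The Python below is illustrative only: the function name `eval_eu`, the set-based encoding of $G_{\psi_1}(D)$, $G_{\psi_2}(D)$, $S_0$ and $S_1$, and the example database are assumptions, and the second and third rules are applied together by taking the union of $S_0$ and $S_1$.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str]


def eval_eu(g_psi1: Set[str], g_psi2: Set[str],
            s0: Set[Edge], s1: Set[Edge]) -> List[Set[str]]:
    """Evaluate the three rules for E(psi1 U psi2) round by round.

    Returns one set per round: the G_phi facts known after that round.
    """
    edges = s0 | s1                # rules two and three differ only in S_0 vs S_1
    derived: Set[str] = set()
    rounds: List[Set[str]] = []
    while True:
        new = set(g_psi2)          # first rule: G_phi(x) <- G_psi2(x)
        new |= {x for (x, y) in edges if x in g_psi1 and y in derived}
        if new == derived:         # fixpoint reached, nothing new is derivable
            return rounds
        derived = new
        rounds.append(set(derived))


# Hypothetical database: a -> b -> c -> c via S_0, and a -> d -> d via S_1 / S_0.
S0 = {("a", "b"), ("b", "c"), ("c", "c"), ("d", "d")}
S1 = {("a", "d")}
stages = eval_eu(g_psi1={"a", "b"}, g_psi2={"c"}, s0=S0, s1=S1)
```

In this example the state `c` is derived first, then `b`, then `a`, mirroring the order $G_\varphi(s_i), G_\varphi(s_{i-1}), \ldots, G_\varphi(s_0)$ in the proof; the state `d` is never derived.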

We now complete the proof of (12) by examining the opposite direction.

Proof ($\Leftarrow$)

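1. If $\varphi = p$ or $\varphi = \neg p$, where $p \in AP$, or $\varphi = \top$, then the corresponding programs are those of Definition 7.2 (1). Trivially, then:

• $s \in G_p(D) \Rightarrow P(s)$ is a ground fact of $D \Rightarrow p \in V(s) \Rightarrow \mathcal{K}, s \models p$.

• $s \in G_{\neg p}(D) \Rightarrow P(s)$ is not a ground fact of $D \Rightarrow p \notin V(s) \Rightarrow \mathcal{K}, s \models \neg p$.

• $s \in G_{\top}(D) \Rightarrow s \in W_{\Pi}(D) \Rightarrow s$ appears in one of $S_0, S_1, P_0, \ldots, P_n \Rightarrow s \in W \Rightarrow \mathcal{K}, s \models \top$.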

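2. If $\varphi = \psi_1 \vee \psi_2$ or $\varphi = \psi_1 \wedge \psi_2$, then the corresponding programs are shown in Definition 7.2 (2). Again, the following hold:

• $s \in G_\varphi(D) \Rightarrow s \in G_{\psi_1}(D) \cup G_{\psi_2}(D) \Rightarrow s \in G_{\psi_1}(D)$ or $s \in G_{\psi_2}(D) \Rightarrow$ (by the induction hypothesis) $\mathcal{K}, s \models \psi_1$ or $\mathcal{K}, s \models \psi_2 \Rightarrow \mathcal{K}, s \models \varphi$.

• $s \in G_\varphi(D) \Rightarrow s \in G_{\psi_1}(D) \cap G_{\psi_2}(D) \Rightarrow s \in G_{\psi_1}(D)$ and $s \in G_{\psi_2}(D) \Rightarrow$ (by the induction hypothesis) $\mathcal{K}, s \models \psi_1$ and $\mathcal{K}, s \models \psi_2 \Rightarrow \mathcal{K}, s \models \varphi$.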

3. If $\varphi = E\bigcirc\psi$, then the corresponding program is shown in Definition 7.2 (3).

Let us assume that $s_0 \in G_\varphi(D)$. From the rules of the program $\Pi_\varphi$ we see that there exists an $s_1$ such that $G_\psi(s_1)$ holds and also one of $S_0(s_0, s_1)$ or $S_1(s_0, s_1)$ holds. By the induction hypothesis we get $\mathcal{K}, s_1 \models \psi$. Let $\pi = s_0, s_1, s_2, \ldots$ be any path with initial state $s_0$ and second state $s_1$. Clearly, then, $\mathcal{K}, \pi^1 \models \psi$ for the path $\pi^1 = s_1, s_2, \ldots$ and $\mathcal{K}, \pi \models \varphi$ for the path $\pi = s_0, s_1, s_2, \ldots$.

4. If $\varphi = A\bigcirc\psi$, then the corresponding program is shown in Definition 7.2 (3).

Suppose now that $s_0 \in G_\varphi(D)$. It is convenient to distinguish two cases:

(a) $s_0$ has a left successor $s_1^L$ but not a right successor in $D$; in this case $S_0(s_0, s_1^L)$ and $\neg 2S(s_0)$ are true. From the second rule of $\Pi_\varphi$ we see that $G_\psi(s_1^L)$ holds. By the induction hypothesis we get $\mathcal{K}, s_1^L \models \psi$. Let $\pi = s_0, s_1, s_2, \ldots$ be an arbitrary path with initial state $s_0$. The fact that $s_0$ has a left successor but not a right successor implies that $s_1^L$ is the second state of every such path. Suppose that there exists a path $\pi = s_0, s_1^L, s_2, \ldots$ with initial state $s_0$ such that $\mathcal{K}, \pi \not\models \varphi$. Trivially then $\mathcal{K}, \pi^1 \not\models \psi$, where $\pi^1 = s_1^L, s_2, \ldots$, which in turn implies that $\mathcal{K}, s_1^L \not\models \psi$, which is false.

(b) $s_0$ has both a left successor $s_1^L$ and a right successor $s_1^R$ in $D$; in this case $S_0(s_0, s_1^L)$ and $S_1(s_0, s_1^R)$ are true. From the third rule of $\Pi_\varphi$ we see that $G_\psi(s_1^L)$ and $G_\psi(s_1^R)$ hold. By the induction hypothesis we get $\mathcal{K}, s_1^L \models \psi$ and $\mathcal{K}, s_1^R \models \psi$. Let $\pi = s_0, s_1, s_2, \ldots$ be an arbitrary path with initial state $s_0$. The fact that $s_0$ has both a left successor $s_1^L$ and a right successor $s_1^R$ implies that either $s_1^L$ or $s_1^R$ is the second state of every such path. Suppose that there exists a path $\pi = s_0, s_1, s_2, \ldots$ with initial state $s_0$ such that $\mathcal{K}, \pi \not\models \varphi$. If that were the case, then $\mathcal{K}, \pi^1 \not\models \psi$, where $\pi^1 = s_1, s_2, \ldots$. But $s_1 = s_1^L$ or $s_1 = s_1^R$, which means that $\mathcal{K}, s_1^L \not\models \psi$ or $\mathcal{K}, s_1^R \not\models \psi$, either of which contradicts the induction hypothesis.
5. If $\varphi = E(\psi_1 U \psi_2)$, then the corresponding program is that of Definition 7.2 (3).

Suppose that $s_0 \in G_\varphi(D)$. From the rules of the program $\Pi_\varphi$ we see that there exists an $s_i$ (possibly $s_i = s_0$) such that $G_{\psi_2}(s_i)$ holds. Further, there exists a sequence of states $s_0, s_1, \ldots, s_i$ such that for every $r$ ($0 \le r < i$) $G_{\psi_1}(s_r)$ and at least one of $S_0(s_r, s_{r+1})$ or $S_1(s_r, s_{r+1})$ is true. By the induction hypothesis we get $\mathcal{K}, s_i \models \psi_2$ and $\mathcal{K}, s_j \models \psi_1$ ($0 \le j \le i-1$). Let $\pi = s_0, s_1, s_2, \ldots, s_i, \ldots$ be any path with initial segment $s_0, s_1, \ldots, s_i$; then $\mathcal{K}, \pi^i \models \psi_2$ and $\mathcal{K}, \pi^j \models \psi_1$ ($0 \le j \le i-1$), i.e., $\mathcal{K}, \pi \models \varphi$.

6. If $\varphi = A(\psi_1 U \psi_2)$, then the corresponding program is that of Definition 7.2 (3).

Let us assume now that $s_0 \in G_\varphi(D)$. Let us define $G_\varphi(D, n)$ to be the set of ground facts for $G_\varphi$ that have been computed in the first $n$ rounds of the evaluation of program $\Pi_\varphi$. For more details on the bottom-up evaluation of Datalog programs see [Ull88]. We shall prove that for every $t \in G_\varphi(D, n)$, $\mathcal{K}, \pi \models \varphi$ for every path $\pi = t_0, t_1, t_2, \ldots$ with initial state $t_0 = t$. We use induction on the number of rounds $n$.

(a) If $n = 1$, then $t$ appears in $G_\varphi(D)$ due to the first rule of $\Pi_\varphi$, i.e., $t \in G_{\psi_2}(D)$. By the induction hypothesis with respect to $\psi_2$ we get that $\mathcal{K}, t \models \psi_2$, which trivially implies that $\mathcal{K}, \pi \models \varphi$ for every path $\pi = t_0, t_1, t_2, \ldots$ with initial state $t_0 = t$.

(b) We show now that the claim holds for $n + 1$, assuming that it holds for $n$. We examine only the case where node $t$ has exactly two successors $t^L$ and $t^R$, since the case where $t$ has only one successor is identical. Without loss of generality we may assume that $t$ first appeared in $G_\varphi(D, n+1)$ during round $n + 1$. This must have happened due to the third rule of $\Pi_\varphi$: $G_\varphi(x) \leftarrow G_{\psi_1}(x), S_0(x, y), S_1(x, z), G_\varphi(y), G_\varphi(z)$. This implies that $t \in G_{\psi_1}(D)$ and both $t^L$ and $t^R$ belong to $G_\varphi(D, n)$. Hence, by invoking the induction hypothesis with respect to $\psi_1$ we get that $\mathcal{K}, t \models \psi_1$, and by the induction hypothesis with respect to the number of rounds we get that $\mathcal{K}, \pi^{1,L} \models \varphi$ for every path $\pi^{1,L} = t_1^L, t_2^L, \ldots$ with initial state $t_1^L = t^L$ and $\mathcal{K}, \pi^{1,R} \models \varphi$ for every path $\pi^{1,R} = t_1^R, t_2^R, \ldots$ with initial state $t_1^R = t^R$. By combining all these, we conclude that $\mathcal{K}, \pi \models \varphi$ for every path $\pi = t_0, t_1, t_2, \ldots$ with initial state $t_0 = t$.

Note that the bottom-up evaluation of Datalog programs guarantees that there exists $n \in \mathbb{N}$ such that $G_\varphi(D, n) = G_\varphi(D, r)$ for every $r > n$, i.e., $G_\varphi(D) = G_\varphi(D, n)$.

7. If $\varphi = E(\psi_1 \widetilde{U} \psi_2)$, then the corresponding program is shown in Definition 7.2 (3).

Let us assume that $s_0 \in G_\varphi(D)$. Let us define $G_\varphi(D, n)$ to be the set of ground facts for $G_\varphi$ that have been computed in the first $n$ rounds of the evaluation of program $\Pi_\varphi$. We shall prove that for every $t \in G_\varphi(D, n)$, there exists a path $\pi = t_0, t_1, t_2, \ldots$ with initial state $t_0 = t$ such that $\mathcal{K}, \pi \models \varphi$. We use induction on the number of rounds $n$.

(a) If $n = 1$, then $t$ appears in $G_\varphi(D)$ due to either the first rule, i.e., $t \in G_{\psi_1}(D) \cap G_{\psi_2}(D)$, or the third rule, i.e., $(t, t) \in B_{\psi_2}(D)$. In the first case, the induction hypothesis pertaining to $\psi_1$ and $\psi_2$ implies that $\mathcal{K}, t \models \psi_1 \wedge \psi_2$, which immediately implies that $\mathcal{K}, \pi \models \varphi$ for any path $\pi = t_0, t_1, t_2, \ldots$ with initial state $t_0 = t$. In the second case, there is a finite sequence $t_0, t_1, \ldots, t_k$ of states such that $t_0 = t_k = t$ and $t_j \in G_{\psi_2}(D)$, $0 \le j \le k$. Thus, by the induction hypothesis, $\mathcal{K}, t_j \models \psi_2$, $0 \le j \le k$. Consider the path $\pi = (t_0, t_1, \ldots, t_k)^{\omega}$; for this path we have $\mathcal{K}, \pi \models \varphi$.

(b) We show now that the claim holds for $n + 1$, assuming that it holds for $n$. We focus on the case where node $t$ has exactly two successors $t^L$ and $t^R$ (the case where $t$ has only one successor is similar). We may further assume that $t$ first appeared in $G_\varphi(D, n+1)$ during round $n + 1$. This can only have occurred because of the fourth or fifth rule of $\Pi_\varphi$. Then $t \in G_{\psi_2}(D)$ and at least one of $t^L$ and $t^R$ belongs to $G_\varphi(D, n)$. Without loss of generality, we assume that $t^L \in G_\varphi(D, n)$. By the induction hypothesis, we know that $\mathcal{K}, t \models \psi_2$ and that there exists a path $\pi^1 = t_1, t_2, \ldots$ with initial state $t_1 = t^L$ such that $\mathcal{K}, \pi^1 \models \varphi$. Immediately then we conclude that $\mathcal{K}, \pi \models \varphi$ for the path $\pi = t_0, t_1, t_2, \ldots$ with $t_0 = t$.

Note that the bottom-up evaluation of Datalog programs guarantees that there exists $n \in \mathbb{N}$ such that $G_\varphi(D, n) = G_\varphi(D, r)$ for every $r > n$, i.e., $G_\varphi(D) = G_\varphi(D, n)$.

8. If $\varphi = A(\psi_1 \widetilde{U} \psi_2)$, where $\psi_1$ and $\psi_2$ are state formulae, then the corresponding program is shown in Definition 7.2 (3).

Let us suppose now that $s_0 \in G_\varphi(D)$. Let us define $G_\varphi(D, k)$ to be the set of ground facts for $G_\varphi$ that have been computed after $k$ rounds of the evaluation of program $\Pi_\varphi$. We shall prove two things by simultaneous induction on the number of rounds $k$:

(1) Let $s \in G_\varphi(D, k)$ and let $\pi = s_0, s_1, \ldots$ be an arbitrary path with initial state $s_0 = s$; then $\mathcal{K}, \pi \models \psi_1 \widetilde{U} \psi_2$.

(2) Let $t$ be a state such that $C_{\psi_2}(t, k)$ holds (here of course $k \le |W|$) and let $\rho = t_0, t_1, \ldots, t_k, \ldots$ be an arbitrary path with initial state $t_0 = t$; then either $\mathcal{K}, \rho \models \psi_1 \widetilde{U} \psi_2$ or $\mathcal{K}, t_j \models \psi_2$, for $0 \le j \le k$.

(a) If $k = 1$, then $s$ appears in $G_\varphi(D)$ due to the first rule of $\Pi_\varphi$, i.e., $s \in G_{\psi_1}(D) \cap G_{\psi_2}(D)$. Hence, $\mathcal{K}, \pi \models \psi_1 \widetilde{U} \psi_2$, where $\pi = s_0, s_1, \ldots$ is any path with initial state $s_0 = s$. We assume of course that $|W| > 1$, because the case where $|W| = 1$, that is, the database contains only one element, is trivial. Similarly, for any state $t$, if $C_{\psi_2}(t, 1)$ holds then $C_{\psi_2}(t, 1)$ can only be derived by the tenth or eleventh rule of $\Pi_\varphi$. In any case, these rules imply that for every path $\pi = t_0, t_1, \ldots$ with initial state $t_0 = t$ we have that $\mathcal{K}, t_0 \models \psi_2$ and $\mathcal{K}, t_1 \models \psi_2$.

(b) We show now that the claim holds for $k + 1$, assuming that it holds for $k$. We consider the case where states $s$ and $t$ have exactly two successors $s^L, s^R$ and $t^L, t^R$, respectively.

(i) Initially, we shall consider the case where $k + 1 \le |W|$.

We may assume that $s$ first appeared in $G_\varphi(D, k+1)$ during round $k + 1$. It is important to stress that in this case $s$ cannot arise from an application of the fifth rule of $\Pi_\varphi$, because the first time this may happen is at round $|W| + 1$. This implies that $s$ must have appeared because of the fourth rule $G_\varphi(x) \leftarrow G_{\psi_2}(x), S_0(x, y), S_1(x, z), G_\varphi(y), G_\varphi(z)$. Hence, $s \in G_{\psi_2}(D)$ and both $s^L$ and $s^R$ belong to $G_\varphi(D, k)$. Then, by the induction hypothesis, we get that $\mathcal{K}, s \models \psi_2$, $\mathcal{K}, \pi^{1,L} \models \varphi$ for every path $\pi^{1,L} = s_1^L, s_2^L, \ldots$ with initial state $s_1^L = s^L$ and $\mathcal{K}, \pi^{1,R} \models \varphi$ for every path $\pi^{1,R} = s_1^R, s_2^R, \ldots$ with initial state $s_1^R = s^R$. Therefore, $\mathcal{K}, \pi \models \varphi$ for every path $\pi = s_0, s_1, s_2, \ldots$ with initial state $s_0 = s$.

Moreover, if $C_{\psi_2}(t, k+1)$ holds, then $G_{\psi_2}(t)$ holds and one of the following must also hold:

- $C_{\psi_2}(t^L, k)$ and $C_{\psi_2}(t^R, k)$,

- $G_\varphi(t^L)$ and $C_{\psi_2}(t^R, k)$, or

- $C_{\psi_2}(t^L, k)$ and $G_\varphi(t^R)$.

Thus, by the induction hypothesis, we know that:

- $\mathcal{K}, t \models \psi_2$, and

- for every path $\rho^{1,L} = t_1^L, t_2^L, \ldots, t_{k+1}^L, \ldots$ with initial state $t_1^L = t^L$, either $\mathcal{K}, \rho^{1,L} \models \varphi$ or $\mathcal{K}, t_j^L \models \psi_2$ ($1 \le j \le k+1$), and

- for every path $\rho^{1,R} = t_1^R, t_2^R, \ldots, t_{k+1}^R, \ldots$ with initial state $t_1^R = t^R$, either $\mathcal{K}, \rho^{1,R} \models \varphi$ or $\mathcal{K}, t_j^R \models \psi_2$ ($1 \le j \le k+1$).

Taking all these into account, we conclude that for every path $\rho = t_0, t_1, t_2, \ldots, t_{k+1}, \ldots$ with initial state $t_0 = t$ either $\mathcal{K}, \rho \models \varphi$ or $\mathcal{K}, t_j \models \psi_2$ ($0 \le j \le k+1$).

(ii) Finally, we examine the case where $k + 1 > |W|$.

In this case, $s$ may belong to $G_\varphi(D, k+1)$ either due to the fourth rule ($G_\varphi(x) \leftarrow G_{\psi_2}(x), S_0(x, y), S_1(x, z), G_\varphi(y), G_\varphi(z)$) or due to the fifth rule ($G_\varphi(x) \leftarrow C_{\psi_2}(x, c_{max})$). If it is due to the fourth rule, then $s \in G_{\psi_2}(D)$ and both $s^L$ and $s^R$ belong to $G_\varphi(D, k)$, and the proof proceeds as in case (i) above. So, let us suppose that $s$ occurs due to the fifth rule, i.e., $C_{\psi_2}(s, |W|)$ is true. As we have already proved in case (i), this implies that given any path $\pi = s_0, s_1, s_2, \ldots, s_{|W|}, \ldots$ with initial state $s_0 = s$, either $\mathcal{K}, \pi \models \psi_1 \widetilde{U} \psi_2$ or $\mathcal{K}, s_j \models \psi_2$, for $0 \le j \le |W|$. If $\mathcal{K}, \pi \models \varphi$ for every path $\pi$ with initial state $s_0 = s$, then we are finished because this is exactly what we have to prove. If however this is not the case, then there must be a path $\rho = s_0, s_1, s_2, \ldots, s_{|W|}, \ldots$ with initial state $s_0 = s$ such that $\mathcal{K}, \rho \not\models \psi_1 \widetilde{U} \psi_2$. But then for the path $\rho$ we would have that $\mathcal{K}, s_j \models \psi_2$, for $0 \le j \le |W|$ (*). Now $\mathcal{K}, \rho \not\models \psi_1 \widetilde{U} \psi_2 \Rightarrow \mathcal{K}, \rho \models \neg(\psi_1 \widetilde{U} \psi_2) \Rightarrow \mathcal{K}, \rho \models \neg\psi_1 U \neg\psi_2 \Rightarrow$ there exists $i \ge 0$ such that $\mathcal{K}, \rho^i \models \neg\psi_2$ and for every $j$, $0 \le j < i$, $\mathcal{K}, \rho^j \models \neg\psi_1$ (**). The fact that $\mathcal{K}, \rho^i \models \neg\psi_2$ immediately implies that $\mathcal{K}, s_i \models \neg\psi_2$ (***) ($\psi_2$ is a state formula). If $i \le |W|$, then a contradiction is immediate because we would have $\mathcal{K}, s_i \models \neg\psi_2$ and $\mathcal{K}, s_i \models \psi_2$ due to (*). Hence, it remains to examine the case where $i > |W|$. Let us examine the initial segment $s_0, s_1, s_2, \ldots, s_{|W|}, \ldots, s_i$ of $\rho$; from Proposition 9.3 we know that in this initial segment there exists a state $s'$ such that $s' = s_l = s_{l'}$, $0 \le l < l' \le i$. So we can get the shorter initial segment $t_0, \ldots, t_{l-1}, t_l, t_{l+1}, \ldots, t_{i-(l'-l)}$, where $t_r = s_r$, $0 \le r \le l$, and $t_r = s_{r+(l'-l)}$, $l + 1 \le r \le i - (l' - l)$. Now if $i - (l' - l)$ is less than or equal to $|W|$, we stop; otherwise we keep applying the same technique until we eventually produce an initial segment $t_0, \ldots, t_m$, where $t_0 = s$, $t_m = s_i$ and $m \le |W|$. Consider the path $\sigma = t_0, \ldots, t_m, \rho^{i+1}$ (a shortened version of $\rho$); we know from (**) that $\mathcal{K}, \sigma^m \models \neg\psi_2$ and for every $j$, $0 \le j < m$, $\mathcal{K}, \sigma^j \models \neg\psi_1$, which implies that $\mathcal{K}, \sigma \not\models \psi_1 \widetilde{U} \psi_2$. The only other possibility left for $\sigma$ is that $\mathcal{K}, t_j \models \psi_2$, for $0 \le j \le |W|$ and, thus, $\mathcal{K}, s_i \models \psi_2$, which contradicts (***). This concludes the proof that $\mathcal{K}, \pi \models \varphi$ for every path $\pi$ with initial state $s_0 = s$. The bottom-up evaluation of Datalog programs guarantees that there exists $n \in \mathbb{N}$ such that $G_\varphi(D, n) = G_\varphi(D, r)$ for every $r > n$, i.e., $G_\varphi(D) = G_\varphi(D, n)$.
$\Box$
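
The shortening step used in case 8(b)(ii) can be sketched as repeated removal of the segment between two occurrences of the same state. The function name `shorten_path` and the example states are hypothetical. Unlike the proof, which stops as soon as the segment is short enough, this sketch removes every repetition, so the result contains each state at most once and therefore at most $|W|$ states.

```python
from typing import Hashable, List, Sequence


def shorten_path(path: Sequence[Hashable]) -> List[Hashable]:
    """Remove cycles from a finite path, keeping its first and last state.

    Whenever positions l < l2 hold the same state, positions l .. l2-1 are
    dropped. Consecutive states of the result remain consecutive in the
    original structure because the state at l2 equals the state at l.
    """
    result = list(path)
    while True:
        first_seen = {}
        cut = None
        for j, state in enumerate(result):
            if state in first_seen:
                cut = (first_seen[state], j)
                break
            first_seen[state] = j
        if cut is None:            # no state occurs twice any more
            return result
        l, l2 = cut
        result = result[:l] + result[l2:]


# Hypothetical initial segment s, a, b, a, c, b, t with the cycle a, b, a inside it.
shortened = shorten_path(["s", "a", "b", "a", "c", "b", "t"])
```

In the example the cycle through `a` is removed, giving the segment `s, a, c, b, t`, which starts and ends in the same states as the original segment.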